DOCUMENT RESUME

ED 171 903                                CE 020 1«I1

TITLE           Evaluation Design and Reporting in Career Education.
INSTITUTION     Office of Career Education (DHEW/OE), Washington, D.C.
PUB DATE        Jul 77
GRANT           G007604329
NOTE            67p.; Some pages in this document will not reproduce well due to broken type
AVAILABLE FROM  Superintendent of Documents, U.S. Government Printing Office, Washington, D.C. 20402 (Stock No. 017-090-01906-0)

EDRS PRICE      MF01/PC03 Plus Postage.
DESCRIPTORS     *Career Education; Educational Assessment; *Elementary Secondary Education; Evaluation Criteria; Evaluation Methods; Evaluation Needs; Evaluators; Interviews; Measurement Techniques; *Program Evaluation; Research Design; Research Problems

ABSTRACT
This document, product of a review of eighty-one final performance and evaluation reports of elementary and secondary career education projects to identify needed improvements in career education program evaluation, has four parts. Part I, a critique of common evaluation practices, covers overall evaluation designs, reporting, questionnaires and interviews, sampling for student impact assessment, research designs, descriptions of evaluated programs, outcome measurement instruments and strategies, and statistical analyses and interpretation. Part II offers solutions to common evaluation problems, first presenting a problem and then suggesting a solution. Part III, addressed specifically to project directors, suggests how to get the most from a third-party evaluation. Part IV presents a checklist and explanation of terms used for reporting results of student outcome studies in career education. The section also applies the checklist to a fourth-grade model program to determine the program's impact on students. (IMS)

Reproductions supplied by EDRS are the best that can be made from the original document.



EVALUATION DESIGN AND REPORTING
IN CAREER EDUCATION



July 1977 
(Reprinted, July 1978)



Prepared under Grant Number G007604329 from:

Office of Career Education 

Office of Education 

U.S. Department of Health, Education, and Welfare

Project Title: 

Synthesizing and Communicating

Career Education Evaluation Results 

Project Director: 

Deborah G. Bonnet, Director 
Research and Evaluation Programs 
New Educational Directions, Inc.
Crawfordsville, Indiana 47933



U.S. DEPARTMENT OF HEALTH,
EDUCATION & WELFARE
NATIONAL INSTITUTE OF
EDUCATION

THIS DOCUMENT HAS BEEN REPRODUCED EXACTLY AS RECEIVED FROM THE PERSON OR ORGANIZATION ORIGINATING IT. POINTS OF VIEW OR OPINIONS STATED DO NOT NECESSARILY REPRESENT OFFICIAL NATIONAL INSTITUTE OF EDUCATION POSITION OR POLICY.



DISCRIMINATION PROHIBITED

Title VI of the Civil Rights Act of 1964 states: "No person in the United States shall, on the ground of race, color, or national origin, be excluded from participation in, be denied the benefits of, or be subjected to discrimination under any program or activity receiving Federal financial assistance." Title IX of the Education Amendments of 1972, Public Law 92-318, states: "No person in the United States shall, on the basis of sex, be excluded from participation in, be denied the benefits of, or be subjected to discrimination under any education program or activity receiving Federal financial assistance." Therefore, career education projects supported under Sections 402 and 406 of the Education Amendments of 1974, like every program or activity receiving financial assistance from the U.S. Department of Health, Education, and Welfare, must be operated in compliance with these laws.



The material in this publication was prepared pursuant to a grant from the Office of Education, U.S. Department of Health, Education, and Welfare. However, points of view or opinions expressed do not necessarily represent policies or positions of the Office of Education.



Superintendent of Documents, U.S. Government Printing Office
Washington, D.C. 20402

Stock Number 017-090-01906-0



Another publication in this series is: 

A Synthesis of Results and Programmatic
Recommendations Emerging from Career
Education Evaluations in 1975-76




PREFACE

Career education evaluation has received an increasing amount of attention in the past several years. Its importance has been emphasized clearly and its methodologies discussed widely. Evaluation of all educational programs is, and probably will be, a high priority for years to come. Career education is now probably among the most extensively evaluated of U.S. educational movements.

However, there is room for improvement in career education program evaluation. To identify needed improvements, 81 final performance and evaluation reports of K-12 career education projects were examined. The publication that follows is a compilation of four parts. Part I is a discussion of common pitfalls in career education evaluation design, execution, and reporting. Part II presents several high-quality evaluation strategies identified in reviewed reports. Part III, addressed specifically to project directors, concerns the effective utilization of contractual evaluation services. Part IV, separate in earlier editions, is a checklist and explanation of terms used for reporting results of student outcome studies in career education.

The Office of Career Education in USOE extends thanks to New Educational Directions, Inc. (NED), a nonprofit service organization of Crawfordsville, Indiana 47933, and to Deborah G. Bonnet, NED Senior Research Associate. New Educational Directions prepared this report under a grant from the Office of Career Education (G007604329).

Kenneth B. Hoyt, Director
Office of Career Education


Projects Included in the Review
Evaluation
Three Types of Evaluation

Part I: Critique of Common Evaluation Practices
    Overall Evaluation Designs
    Reporting
    Questionnaires and Interviews
    Sampling for Student Impact Assessments
    Research Designs
    Descriptions of Evaluated Programs
    Outcome Measurement Instruments and Strategies
    Statistical Analyses and Interpretation

Part II: Unique Solutions to Common Evaluation Problems
    Referenced Projects

Part III: A Note to Project Directors on How to Get the Most From Your Third Party

Part IV: Checklist for Reporting Results of Student Outcome Studies in Career Education



INTRODUCTION

The project of which this report is one product was designed to encompass 108 career education programs funded by the U.S. Office of Education during the 1975-76 academic year. The projects, funded under two separate Federal programs, were: 47 three-year grants from the Bureau of Occupational and Adult Education pursuant to Section 104(c), Part D of Public Law 90-567; and 61 grants and contracts funded for one year by the Office of Career Education (OCE) under Section 406, Title IV of Public Law 93-380. All Part D projects were designed to incorporate grades K-12, K-14, or K-Adult. The OCE projects included those in funding Categories 1 (K-12 incremental improvement), 2 (senior high school settings), 3 (special populations) and one program for adults funded in Category 5 (communications).



However, all 108 projects were not included in the review. Some had not yet submitted their final performance and evaluation reports as of our cut-off date of March 4, 1977, even though the large majority of the reports were due September 30, 1976. A few reports which were received were not included in the analyses discussed in this report because the programs were not designed to develop and test strategies for career educating students; one such program was for the development of a state plan for career education which was funded under OCE's Category 1. Another report in this series, What Does Career Education Do For Kids: A Synthesis of 1975-76 Evaluation Results, addresses only the projects reporting student impact evaluation results at the K-12 level which met the criteria for inclusion in that synthesis.

The number of projects falling under each of these circumstances is shown below:

                                             OCE    Part D    Total

Proposed for review                           61      47       108

Final performance and evaluation
reports received by NED as of 3/4/77          49      32        81

Addressed in this report                      45      32        77

Met criteria for inclusion in synthesis
of student impact evaluation results          21      26        47

Evaluation

The term evaluation is defined very broadly here to include activities usually considered measurement or assessment.

We have made no distinctions between evaluative data and descriptive data, partially because what may be purely descriptive to one person could be evaluative to another. For example, a report of the number of school staff members trained through a project may in some circumstances be an indication of the project's success in sparking local interest in career education, whereas in other cases the same data would have no meaningful evaluative implications.

Another reason for using a broad definition of evaluation is that data which alone are only descriptive often serve as crucial components of evaluation systems and lead to evaluative conclusions when interpreted in conjunction with other components of the systems.



Program evaluation has many purposes and serves the needs of a number of audiences, but among its most important functions is its role in the responsibility of exemplary projects to develop model programs suitable for replication in other settings. This primary mission of all exemplary or demonstration projects entails not only the development of programs and lines of communication with potential adopters of the programs, but also includes providing enough information about the model program to allow others to make informed decisions of whether it should be adopted in their schools. This means that the potential adopter needs and deserves answers to the question, "Does this program work?", but the program's effectiveness is of little concern if the question, "What is the program?", cannot be answered. Thus, documentation of a program's effects is not sufficient; the probable causes of the effects also must be documented if exemplary programs are to have value for schools other than those where they were developed. Since precise and quantitative descriptions of the tasks and resources involved in installing a model program and in implementing it are important components of the evaluation system, it is important to consider these components in analyzing career education evaluation.

Three Types of Evaluation

Evaluation is generally considered to be of two types: process and outcome, or formative and summative. However, we have found three categories to be more meaningful for externally-funded career education projects. The typical sequence of events for such projects is: 1) The funded project staff members develop and implement various strategies designed to bring about 2) educational reform. These changes in instructional practices and in the relationships between the school and community, in turn, have 3) an effect on students. Depending on how it is viewed, (2) above could be considered either a process or an outcome; thus, the use of three levels of evaluation rather than the usual two. Evaluation strategies are categorized here as 1) project strategy assessment, 2) educational reform outcome assessment, or 3) student outcome assessment on the basis of the level at which evaluation data have the most direct bearing. Thus, student achievement results are classified as student outcome assessment data even though they may reflect the effectiveness of project strategies and the extent of educational reform. The illustration below demonstrates a typical sequence of events and the level associated with each.
Level                         Event

Project Strategies            Project staff develop training program.
                              Teachers are trained.

Educational Reform            Teachers develop positive attitudes
Outcome                       toward career education.
                              Teachers develop career education
                              knowledge and skills.
                              Teachers implement career education
                              activities with students. (Students
                              experience career education activities.)

Student Outcome               Students develop new skills, attitudes,
                              and knowledge.



Categorizing events and their associated evaluation strategies in this manner loses meaning in cases where the funded staff members assume a major role in the direct delivery of career education experiences to students, such as in most experience-based career education (EBCE) programs. However, in the vast majority of projects encompassed by this review, the primary roles of the funded staff were to design programs and to influence other educators and community members to implement them; some contact with students generally was maintained but this was rarely central to the staff's responsibilities.



CRITIQUE OF COMMON EVALUATION PRACTICES 

Overall Evaluation Designs

Evaluation designs were in most cases multi-faceted, addressing numerous process and outcome objectives. Virtually every project reported at least one evaluation activity concerned with project strategies and educational reform outcomes and the great majority (88% of Part D and 78% of OCE) included student impact assessment data. Furthermore, student impact assessments tended to be comprehensive, focusing on several objectives at each of several grade levels. A typical student impact evaluation consisted of assessments of three or four objectives at each of three or four grade levels. Nevertheless, the strategies and resources leading to success or failure in achieving student outcome objectives tended to be inadequately defined.

Inadequate information about project strategies and activities. A major concern of potential adopters of exemplary programs and of other audiences of final reports is the logistics of project management, including such matters as the skills required of project staff members and the equipment and facilities needed for project operation. Information about project strategies and activities is interesting in itself and also serves as a background for the interpretation of outcome evaluation results.

Table 1 lists fourteen aspects of management which are pertinent to all of the 77 projects included in our review and which are likely to influence a project's success in effecting educational reform and student impact. As demonstrated in the table, many of these project management variables rarely are mentioned in final reports; even fewer reports than indicated in the table address these topics thoroughly.

We must point out, however, that the standards NED implicitly set for final reports by identifying the kinds of information we expected to find in them are our standards. The projects generating the reports had no access to them. The federal government had not endorsed them, nor have they yet, nor are they likely to. Although our first reaction to the finding that less than half of the reports indicated the number of positions on the project staff was one of alarm, this initial response was tempered by the realization that nothing in the instructions indicated that this information may be of interest to the readers of final reports.

Both Part D and OCE instructions for preparing performance reports are by necessity in the form of outlines which identify broad topics defined in general terms. It would behoove project staffs and evaluators alike to recognize that these instructions are the result of a complex series of compromises among several government agencies and that they represent only minimum requirements.

An indication of the typical report's failure to quantify major project activities or to address the quality of those efforts is seen in discussions of school staff development in career education, which was a major function of 76 of the 77 projects.

Only 33 of the project performance reports and 9 evaluation reports discussed either the content or the logistics of staff training. The total number of training sessions offered by the project was indicated in the project performance and/or evaluation report in 21 cases, but an unduplicated count of the school staff members involved in training was given for only 10 projects and in only 3 cases was the average amount of training per staff member indicated. Slightly over half of the projects' evaluations included assessments of the quality of at least one training session either in terms of its effect on participants' knowledge and attitudes or in terms of their satisfaction with the session. However, total staff development programs, which usually consist of multiple training sessions employing several training modes, were assessed qualitatively in only 15 cases. Data indicating the quantity and quality of other project functions such as community education and materials development are similarly sparse.
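
The three quantities named above (the total number of training sessions, an unduplicated count of staff members trained, and the average amount of training per staff member) can all be derived from a simple attendance log. The Python sketch below is an illustration only: the record layout, the names, and the hours are hypothetical and are not drawn from any reviewed project.

```python
from collections import defaultdict

# Hypothetical attendance log: (session_id, staff_member, hours_attended).
# Five attendance entries, but only three different people.
attendance = [
    ("S1", "Adams", 2.0),
    ("S1", "Baker", 2.0),
    ("S2", "Adams", 3.0),
    ("S2", "Clark", 3.0),
    ("S3", "Baker", 1.5),
]


def summarize_training(records):
    """Return (sessions offered, unduplicated staff count, mean hours per person)."""
    sessions = {session for session, _, _ in records}
    hours_by_person = defaultdict(float)
    for _, person, hours in records:
        # Each person is keyed once, so repeat attendees are not double-counted.
        hours_by_person[person] += hours
    staff_count = len(hours_by_person)
    mean_hours = sum(hours_by_person.values()) / staff_count if staff_count else 0.0
    return len(sessions), staff_count, mean_hours


summary = summarize_training(attendance)
print(summary)
```

Keying the totals on the individual staff member is what keeps the count unduplicated; summing attendance entries instead would overstate the number of people reached.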

Inadequate information about career education activities resulting from the project's efforts. The weakest element of evaluation designs was usually in the identification of the amount and type of career education activities taking place. This problem took two forms.

First, projects' educational reform outcomes rarely were assessed thoroughly; only five reports provided clear indications of which groups contributed to the program's development and implementation, in which ways, and to what degree. This is particularly ironic for OCE programs in Category 1, as their main focus was to be on demonstrating incremental quality improvement defined in terms of educational reform outcomes.



Second, student impact evaluations tended to be unclear concerning the amount and type of career education experienced by the students whose outcomes were measured. Thus, the effectiveness of many student activities and programs was reported without those activities and programs being defined.



TABLE 1

PROJECT MANAGEMENT

Percentages of project and evaluation reports which discussed any aspect of the following components of project management:

Listing of funded staff positions
Qualifications or backgrounds of funded staff
Funded staff training
Coordination and communication among funded staff
Location of project's office(s) (e.g., in a school)
Project facilities (e.g., office space, equipment)
Fiscal management policies
Cost analysis beyond the required financial status report
Chronological plan for program development
Methods of announcing the availability of project services
Strategies for securing participation in project activities
Logistics and/or topics of school staff training
Composition and/or accomplishments of advisory committee
Obstacles or problems encountered

Project    Evaluator




16% 


11% 


&% 


25% 


1% 


26% 




19% 


5% 


3% 


3r 


3% 


1% 


9% 


• lk% 


51% 


■ -,12% 


785 


m 


25% ' 


&% 


k3% 


12% 
* 


55% 




■ h5% 


23% 




The result of this weakness is that evaluation designs were often fragmented, consisting of discrete sets of data which had little bearing on one another. For example, an evaluation may consist of a description of a career education curriculum guide, participants' assessments of a staff development program in the guide's proper use, and measurement of changes in students' decision-making skills. If evaluation results indicate that trainees were satisfied with the quality of the staff development program but that students' decision-making skills did not improve, it could be inferred that the activities suggested in the curriculum guide are ineffective. But this could be entirely incorrect, because we would not know whether the staff development program led to the actual use of the curriculum guide, whether it was used as intended, or whether it was used in the instruction of the students who were tested. Thus, we would not know whether to revise the curriculum guide, the training program, or both; all we know is that the staff were satisfied with the training program and that students' decision-making skills did not improve.

Lack of transportable recommendations. As evidenced in Recommendations for the Implementation and Management of Career Education Projects, project staffs and evaluators offered a number of transportable recommendations, as well as ones directed specifically to the project's locale. Nevertheless, many evaluations did not include this final step in the evaluation process. Some resulted in specific recommendations for the evaluated project, as they should, but did not address the needs of potential adopters of the exemplary program. Several evaluators made comments such as, "Since the project was not refunded for next year, no recommendations are offered for the program's improvement", which seem to indicate that the evaluation was intended to serve only local needs and to serve them only so long as career education was supported through special funds.



Reporting

Insufficient information. By far the most glaring weakness of career education evaluation is the quality of reporting and the worst problems are ones of omission. Evaluation reports, on the whole, fail to provide enough information about either the program under evaluation or about the evaluation itself to allow the reader to interpret and use the results. The problem is evident in the findings presented above and will also be addressed later. The Checklist for Reporting Results of Student Outcome Studies in Career Education, another document in this series, provides guidelines for reporting on student impact evaluations and was designed to alleviate that part of the problem with evaluation reporting. Therefore, we will not dwell on topics addressed in the Checklist and will try to keep our comments on omissions relating to other kinds of evaluation brief.

Organization. Another common, but by no means universal, weakness of evaluation reports is their organization. All too popular is the professional journal format, where the report begins with a discussion of all of the evaluation questions, proceeds to descriptions of all of the instruments, then to the evaluation groups, data-collection procedures, results, and, finally, the conclusions. This format, although very reasonable for a four-page journal article concerned with one research problem, can be rather awkward in a thirty-page evaluation report addressing ten evaluation questions. We are not going to suggest a new standard format because the point we wish to make is that there is no one organization that will cover all cases optimally. However, each of the following bases for organization has worked well in some cases:

* Sections for project strategies, educational reform outcome, and student impact assessment.

* Each program objective or component, or each evaluation question or hypothesis addressed separately. (So long as there is not a huge number of them.)

* Each data source separately (such as all teacher data together, all third-grade student data together, etc.).



Three simple rules of thumb of report organization are:

* Avoid the need for redundancy. There were cases where a single set of data was presented and discussed in two or more places because it related to more than one section of the report as it was organized.

* Place related results in close proximity. For example, discussions of the type and amount of career education experienced by sophomores at Central High School should be easily associated with their outcome results. Pre-test results should be in the same table or at least on the same page as post-test results. If a table or the text refers to questionnaire item 8, it should not be necessary to consult the appendices to find out what the question was.

* Include a table of contents.

References to non-appended appendices are remarkably common. Tables which do not indicate what the numbers within them represent were also encountered.

Internal inconsistencies within evaluation reports take the forms of conflicting data (e.g., two different counts of the number of advisory council meetings); inconsistent titles (of people, tests, etc.); and conflicts between tabulated data and discussions of them (e.g., the comparison group scored higher than the career education group according to the table but results are discussed as if they were positive).

Poor articulation between the evaluation report and the project's performance report. The two reports were often redundant, particularly in cases where both reports discussed the activities associated with each objective. Still, inconsistencies of the types listed above were common between the two reports and important information was often present in neither report. Better coordination between evaluators and project staffs would make the two reports more complementary and the package more thorough. Some redundancy should be retained, perhaps in the form of each report containing an abstract of the other, but there is little advantage in the evaluator's seconding everything the project says, or vice-versa.

Statements that data were collected, but no results given. Sometimes the reason for omitting the results of a particular evaluation strategy was given. It may be that a thorough report was prepared previously and was not included in the final report for the sake of brevity. In these cases a synopsis should still be included if for no other reason than to assure the reader that the motive for the omission is pure. The same should be done if the data are not presented because they were judged to be of interest only to the project staff.

Negative evaluation findings should not be covered up. Besides the ethical issue, negative findings are useful and should be conveyed to others so that they may avoid the same mistake. Negative findings can also enhance the credibility of the positive ones.

Questionnaires and Interviews

Almost every evaluation involved the use of tailor-made instruments for gathering data such as community members' attitudes toward career education, educators' degree of implementation of career education programs, or students' opinions of career education activities. For the most part these evaluation components were well-conceived and well-executed, but a few errors were committed repeatedly.

Poorly worded questions or response options. For example, this question is ambiguous because it defines neither "career education activity" nor the time frame for the answer:

    How many career education activities have you carried out?

This item, though perfectly clear to the adult-level reader, may have been over the heads of the junior-high students who were asked it:

    Indicate your feelings about the time allowed for adequately
    completing laboratory activities.

    a. too much
    b. just right
    c. not enough

The problem with this one is not so much the question itself, but the meaning of the answer:

    Do you understand that career education encompasses all education:
    professional, technical, vocational?

    a. yes
    b. no

Incomplete information. Questionnaire results should be accompanied by a list of the questions and an indication of the group to whom it was administered, the number who completed it, the date of administration, and, if applicable, the return rate. The same types of information are needed with interviews. The interview guide should be included and the methodology should be described, at least to the extent of indicating whether interviews were conducted by telephone or in person.



Analysis of results. Most questionnaire and interview data should be reported in terms of the responses to each question; total "scores" are meaningful only in special cases such as with some attitudinal questionnaires. In deciding how much detail to report, the object is to give enough information to make meaningful interpretations without bombarding the reader with page after page of numbers. One extreme to avoid is illustrated by a case where a 20-item opinion survey was given to 45 individuals. Results were averaged across respondents and items and reported simply as "84% positive". The other extreme is to report the responses to each question for each of ten or so groups separately. Information is lost by combining some of those groups, but the information that remains is more easily interpreted. If results are presented for each of several groups (such as elementary, middle school, and high school staffs), the number of people giving each answer should be converted to percentages to facilitate comparisons across groups.
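
The conversion from counts to percentages within each group can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the group names follow the example in the text, but the response counts and the helper `to_percentages` are hypothetical.

```python
def to_percentages(counts):
    """Convert {answer: count} to {answer: percent of that group's respondents}."""
    total = sum(counts.values())
    if total == 0:
        # No respondents in the group: report zeros rather than divide by zero.
        return {answer: 0.0 for answer in counts}
    return {answer: round(100.0 * n / total, 1) for answer, n in counts.items()}


# Hypothetical responses to a single questionnaire item, by staff group.
responses = {
    "elementary": {"yes": 30, "no": 10},
    "middle school": {"yes": 12, "no": 8},
    "high school": {"yes": 9, "no": 21},
}

percent_by_group = {group: to_percentages(c) for group, c in responses.items()}
for group, pct in percent_by_group.items():
    print(group, pct)
```

Because the groups differ in size (40, 20, and 30 respondents here), the raw counts are hard to compare directly, whereas the percentages can be read side by side. Reporting the number of respondents alongside each percentage lets the reader judge how much weight the figure can bear.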

Open-ended questions are good in some respects, but the results are difficult to deal with. If they are used, time and resources must be allowed for organizing the responses and identifying trends within them. The other alternatives, both of which were taken, are to present lengthy lists of verbatim remarks or to report simply, "The evaluation team interviewed thirty-two members of the community resource pool and learned that, all in all, they are enthusiastic about the program."

Sampling for Student Impact Assessment

A sample is a subset of a population. Samples are used in most research and evaluation activities because testing everyone in the population is expensive, inconvenient, and/or impossible. So that the results of research can be used to make predictions for the population, procedures are used to ensure that the sample is a subset of the population and that it resembles the population as much as possible. Therefore, a fundamental consideration in sample selection is, "To what population do we want to generalize?" In order to answer this question we must first establish the purpose of the evaluation and also of the program.
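
One common procedure for obtaining a sample that is a subset of the population and is expected to resemble it is simple random selection from a complete list of the population. The Python sketch below illustrates the idea only; the population list, the sample size, and the seed are hypothetical, and the function `draw_sample` is our own naming rather than a procedure used by any reviewed project.

```python
import random


def draw_sample(population, size, seed=None):
    """Draw a simple random sample without replacement from a population list."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(population, size)


# Hypothetical population: an enrollment list of 200 students.
population = [f"student_{i:03d}" for i in range(1, 201)]
sample = draw_sample(population, 40, seed=1977)

print(len(sample), "students selected")
```

Every member of the list has the same chance of selection, which is what justifies generalizing from the sample to the listed population. It does not justify generalizing beyond that list.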



Exemplary or demonstration projects such as career education projects funded by the USOE have one primary mission: to develop programs which can be adopted or adapted by other schools to fulfill needs common to many or all American school systems. This mission implies many responsibilities for exemplary programs, one of them being to provide potential adopters of their program with sound bases for deciding whether to try them in their school systems. That is, a school administrator in California should be able to predict whether a program developed in Arkansas will work in his or her school on the basis of the Arkansas program's evaluation results.

This is equivalent to saying that the students who are chosen for the
Arkansas evaluation should be representative of a national population, such
as American high school students, gifted fifth-graders, or rural low-income
kindergarteners. One way (in some respects the best way) to draw a
representative national sample for testing the program's effectiveness is
to choose randomly from all American rural low-income kindergarteners
forty or so children to undergo the career education program. To the best
of our knowledge, this has never been done. Fortunately, there are other
ways to meet the assumption inherent in all research that the sample
represents the population to which its outcomes are to be generalized.
One of the most feasible ways for field research activities to meet this
assumption involves a somewhat backward logic: students are "selected" for
the exemplary career education program for whatever reasons (such as their
teachers' interest in career education), then the population which those
students represent is identified by describing the group's characteristics
(e.g., low-income rural kindergarteners). A valid way of evaluating such
a program would be to measure its effects on the students who were
"selected" to participate in it and to describe precisely what the program
consisted of and which national population the students represent. Then
the program's potential adopter can predict its effectiveness for members
of his/her student body who represent its same population. Of course, the
larger the population to whom the results can be generalized, the more
useful the results are to more school systems. That is, a program
which is found effective for a group of students representing all
achievement levels may be more appealing than a program which has
demonstrated its effectiveness only for students of below average
achievement.



It is widely believed that in order for career education results
to be generalized, the students involved in testing must be selected
randomly from the school system implementing career education. This
is true if the population of interest is students in the system. If
the population is national, students can be chosen randomly from
within the system but the sample is still far from a random sample of the
national population; for all practical purposes, this sample would be
no more random from a national perspective than a very deliberately
chosen group within the school system. We do not wish to imply that
every conceivable sampling technique will yield valid evaluation
results, but rather that the practice of testing randomly-selected
students from the school district or from schools or classrooms
"participating" in the career education program is not the only valid
sampling plan for career education evaluation.

This sampling strategy has, nonetheless, gained a great deal of
popularity in career education evaluation, perhaps because of career
education's intent to serve all students and because career education is
more often infused into the curriculum than taught as a separate course.
Let us examine what evaluation results tell us if we test, say, 50
sixth-graders selected at random from four elementary schools
participating in the district's career education program. We will assume
that a school's participation means that all of its teachers have received
some minimum amount of inservice training and that the project's material
and human resources are available to the school. We still cannot assume
that the 50 children have experienced comparable career education. For
that matter, we should not assume that all 50 children have experienced
career education of any amount or type; to do so would be to ignore the
widely-observed tendency for some teachers to reject the concepts of
career education while others embrace them enthusiastically, for some
teachers to emphasize self-awareness while others concentrate on career
information, for some teachers to rely heavily on class discussions while
others prefer field trips. Since the only reasonable assumption we can make
about the career education experiences of these 50 students is that they
probably differed a great deal from student to student, the results of
the evaluation would tell us very little about the effectiveness of any
particular set of student activities. What the results would indicate
is the changes which can be expected in students whose teachers have
access to the project's resources. That is, it is not really a student
program which is being assessed. Rather, the evaluation results are an
indirect measure of the educational reform resulting from staff
development and other project strategies.

This sampling approach has its value under certain circumstances.
The questions these evaluation results answer are, "What has the career
education project done for our school system's students?" and, "If
another district adopts our staff development and supportive services
systems, what student benefits can be anticipated?" The reason this
sampling approach is listed here as a common weakness of career education
evaluations is that it is often applied prematurely, before a more basic
question is answered: "Does our newly-developed student program work?",
or, "Are these really the experiences we should be asking educators and
community members to provide for students?"

We cannot determine whether the program works unless we know that
it has been taught to the students who are selected for testing the
program's effectiveness. To make sure that tested students have experienced
the program, we could begin by selecting students to participate in the
piloting of a new program and then measure its impact on them. Another
approach that is often more practical, especially for projects where the
individual teacher (or counselor, student, etc.) is left a good deal of
latitude in designing the student program, is to select for testing a
group of students who have experienced the program which most clearly
resembles the project's model. If the model program is a separate course,
sample selection is a simple matter of identifying the course's enrollees
and testing either all or a random sample of them. If the model program
is a form of classroom infusion, the evaluation sample usually consists
of the students of teachers who have implemented the infusion curriculum
to the greatest degree. Of course, students selected in this manner still
will not have experienced exactly the same set of career education
activities, even if they are drawn from a single classroom, but variations
in their experiences will be much smaller than those of students drawn
randomly from "participating" schools. The results of evaluations
involving only "identified as treated" students, like those of evaluations
using random selection, are generalizable to the population which the
sample represents and tell us what impact the model program has where it
is applied rather than the impact of the availability of resources for
applying the program, where those resources may or may not be used.
The distinction between the two sampling approaches loses its meaning
if all students in participating schools receive comparable career
education instruction, but this appears to be a relatively rare
circumstance.

In What Does Career Education Do For Kids?, 569 student outcome
studies are synthesized, where a "study" is an assessment of one
outcome objective at one grade level. Only 169 of these studies employed
"identified as treated" sampling plans, even though most of the
projects generating these studies were in relatively early stages of
program development, where the priority question should concern the
viability of their student programmatic models. It should be no
surprise that these studies were three times more likely than studies
employing random selection to show statistically significant positive
results (60 ÷ 169 = 36% vs. 48 ÷ 400 = 12%). These results demonstrate
one advantage of evaluating programs according to their impact on
students who are known to have experienced them.
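
The comparison in the preceding paragraph can be checked with a few lines
of arithmetic. The sketch below assumes that the 400 randomly sampled
studies are simply the remainder of the 569 synthesized studies, and that
48 of them showed significant positive results (the count implied by the
12% figure); the function name success_rate is hypothetical.

```python
def success_rate(significant, total):
    """Percent of studies showing statistically significant positive results."""
    return 100.0 * significant / total

total_studies = 569
treated_total = 169                           # "identified as treated" studies
random_total = total_studies - treated_total  # assumed remainder: 400

treated_rate = success_rate(60, treated_total)
random_rate = success_rate(48, random_total)  # 48 is inferred from the 12% figure
ratio = treated_rate / random_rate

print(f"{treated_rate:.1f}% vs. {random_rate:.1f}% (ratio {ratio:.2f})")
```

The ratio comes out just under three, consistent with the statement that
the "identified as treated" studies were about three times more likely to
show significant positive results.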

Research Designs*

Over-reliance on national norms. One-time testing of a single
group of students who are compared to national norms may result in
useful needs assessment data, but the results are of very limited utility
in evaluating the impact of a program. Drawing conclusions from results
such as "after career education, the students scored significantly higher
than the national mean" requires assuming that without career education,
the evaluation sample's mean would be the same as the national
population's. This is a questionable assumption for two reasons.

The first is that "national norms" are not really national. Some
standardized tests are normed on larger and more representative samples
of national populations than are others, but all are subject to errors
in estimating the population's score distributions.



*Many of the points made here and in a later section on statistical
analysis are addressed more thoroughly in the excellent publication: A
Practical Guide to Measuring Project Impact on Student Achievement, by
D. Horst, K. Tallmadge, and C. Wood of RMC Research Corporation, 1975,
ERIC number ED 106376.



The second problem is that there is no reason to expect any given
sample of students to match a national population unless the sample is
drawn randomly from the national population. Even if every third-grader
in a school district is tested, their mean score is likely to be
significantly different from the fiftieth percentile because neither the
community nor the local educational system is likely to be "typically
American" in every relevant way. If the evaluation sample is fairly
small or if it is not representative of the district, the chances of its
matching the national population are reduced further.

However, norm-referenced tests have advantages when used in other
evaluation designs where there is no need to assume that the students'
scores would match the national norm without career education, such as
in studies involving pre-post without comparison group data. This use of
norms still involves assumptions concerning non-career-educated students'
expected post-test performance relative to their pre-test performance but
these relative assumptions are generally more defensible than absolute
ones. This is somewhat parallel to the generalizability of evaluation
results, in that the educator in California can reasonably expect the
program which resulted in improved student learning in Arkansas to result
in improvements in his/her school system even though the actual test
scores of the California and Arkansas students are likely to differ to a
statistically significant degree.*

Excessive use of the pre-post without comparison group design. If
the objective of the evaluation is to achieve statistical significance,
this is the design to use. But if the objective is to determine the
impact of a career education program, other possibilities should be
considered.

Reasonable cause-and-effect inferences can be drawn from studies of



*Incidentally, if a local sample is compared to a national norm, the
z, not the t, should be used to test whether the sample mean is
significantly different from the population's. Since the t for independent
groups compares two samples rather than a sample to a population, its use
implies a rejection of the hypothesis that the norm represents the
population's score distribution. Although it may be valid to reject that
assumption, this rejection also says that the validity of the evaluation
strategy is rejected by the evaluator.



pre-post growth under any one of the following circumstances:

1) Other research indicates that growth is unlikely without special
intervention. For example, results presented in What Does Career
Education Do For Kids? suggest that self-esteem is not likely to improve
over a period of six or so months even with career education, so a
pre-post without comparison design would be a conservative one for
evaluating this outcome.

2) Logic indicates that growth is unlikely without intervention.
For example, job-seeking skills can be considered to be of two types.
Some of these skills, such as the preparation of a resume, are purely
cognitive, and it is reasonable to assume that they will not be learned
without special efforts. Interview skills, on the other hand, consist
largely of poise and communication skills which could be acquired through
routine experience.

3) The time between pre- and post-testing is extremely short.
For example, the design would probably suffice for a two-week
mini-course in career education.

We recognize the difficulty of securing the cooperation of the
comparison groups needed for the more conclusive research designs.
Nonetheless, the problem most often cited as a rationale for designs
without comparison groups is that all students in the district have been
exposed to career education either directly or indirectly. This is a
problem, certainly, but its magnitude often seems to be exaggerated.

It is true that if some students in a school system experience
career education, all are likely to receive some amount via their
friends, bulletin boards, small changes in their teachers' practices
resulting from contact with involved teachers, etc. But to say that
this minimal career education exposure destroys these students' utility
as a comparison group is to say that the minimal experience is equally
or nearly as effective as an intensive, high-quality career education
program. Accepting that assumption puts us in a poor position to ask
educators and the community to devote significant efforts to the
implementation of career education programs.

Only if career education is well-established in the district and
virtually every student or every student in an identified sub-population
(such as gifted children) is involved to a large degree should it be
impossible to identify a reasonable comparison group within the district.
It may be necessary to settle for comparing the effects of a lot of career
education to the effects of a little rather than to the effects of none,
but if the more intense program is more effective and the evaluation is
designed to be sensitive to the difference, this is not a serious
compromise. It is much easier to identify a comparison group if the
"identified as treated" sampling approach is used than where students are
selected at random from participating schools. Often, students of teachers
who have participated in some staff development activities but who have
not become active in career education constitute adequate comparison
groups.

Use of comparison groups whose credibility has not been established.
The credibility of many evaluations could probably be enhanced through
demonstrations of the comparison group's comparability to the career
education group. The Checklist discusses several ways of doing this.

Descriptions of Evaluated Programs

Not specific. Although most final reports indicated in general terms
the nature of the student programs whose impact was assessed, relatively
few were specific about the kinds of activities comprising the programs.
This seriously compromises the utility of evaluation results outside of and
perhaps even within the school system.

Not quantified. Few evaluation reports provided insight into the
question, "How much career education did it take to produce these student
outcome results?"

Not matched to outcome results. Sometimes both treatment and
outcome data were presented, but it was still impossible to identify the
career education activities of each group of students whose outcomes were
measured, and thus to determine which set of activities was associated
with each set of outcomes.

It may often be artificial to identify for a single group of students
the particular activities which produced each outcome. For example, a
seventh-grade program may consist of field trips designed to convey
career information, role-playing to clarify work values, and filmstrips
to develop decision-making skills. However, explaining the evaluation
results of each of these three outcomes only in terms of the activity
type which was designed with that outcome in mind probably would be overly
simplistic and involve overlooking interactions among the three kinds of
activities. That is, if success were demonstrated in decision-making
skills but not in career information, one might conclude that the
filmstrips were effective but the field trips were not. But it could be
that eliminating the field trips would reduce the program's impact on
decision-making skills. The point is that whereas it is appropriate
to discuss the rationale for each component of the student program,
outcome results generally should be interpreted for the program as
a whole for any given group of students.

However, the outcomes of seventh graders should be associated
with the activities of the same seventh graders, particularly if the
"identified as treated" sampling plan is used. Data indicating the
average amount and type of career education received by all seventh
graders in the school system will not suffice in interpreting the
outcome results of students who were chosen for the evaluation because
they had undergone a presumably above-average program. Similarly,
treatment data which are aggregated across grade levels cannot be used
to interpret the outcomes at any one grade level, regardless of the
sampling plan. For example, knowing that K-12 students receive an
average of three hours of career education instruction per week may
be useful information, but it gives no insight into the amount of career
education required to produce the accelerated reading achievement
observed in second grade unless we assume that second graders received the
same amount of career education as high school juniors and kindergarteners.
As another example, knowing that 93 guest speakers visited elementary
schools is in itself meaningless as treatment data. If a random selection
sampling plan is used to evaluate fifth grade impact, the average number
of visits to fifth-grade classrooms should be indicated. If three
fifth-grade classrooms are chosen for evaluation in an "identified as
treated" sampling plan, the treatment would best be described in terms of
the number of guest speakers visiting each of the three classes.

Outcome Measurement Instruments and Strategies

The choice of measurement instruments is perhaps the single most
important determinant of whether evaluation results will reflect the
program's impact on students. Nonetheless, this decision often seems to
receive less attention than it deserves. Career education outcome
measurement poses considerable difficulties but new instruments and
strategies are emerging and the problem is becoming less serious than
it was several years ago. Also, the option of locally-developed
instruments should never be ruled out as a possibility; many programs have
done this with great success.




In choosing instruments, the first question should be, "Does it
measure what we want to evaluate?" This question must be answered
locally, and by career education practitioners, not by evaluators alone.
As a general rule, the best test is one where students who give "correct"
answers represent what the program is attempting to produce. There are
other considerations, of course, and expert opinion and psychometric
data should be considered, but instruments chosen solely on the basis of
how many other programs have used them or what "experts" far removed
from the local program say about them are likely to result in
disappointing evaluation results and invalid conclusions about the
program's effectiveness.


A number of programs were made keenly aware of the need for careful
selection of instruments after negative evaluation results were found and
their examination of the tests revealed that they had little to do with
the program's objectives. Some of them recommended, as do we, that
instruments be scrutinized before, not after, their administration.

The rationale for measurement strategies should always be indicated
in some way, but this is especially important if the technique is unusual.
A few examples of cases where the rationale was not clear are:

* Even though career educators have emphasized time and again that
  they do not expect or desire young children to make career choices,
  three projects evaluated the wisdom of fourth-graders' career plans.
  In one case, even the rationale of the measurement technique was
  unclear: the student stated his or her ideal and realistic career
  choices and judges rated the difference between the two on a
  socio-economic scale, hoping that ideal and realistic career
  choices would show similar socio-economic status. We considered
  the measurement technique of the other two cases sound (judges
  rated the compatibility of the student's top three career choices
  with his/her responses to each of thirty questions relating to
  personal interests, abilities and values), but assessing the
  validity of fourth-graders' career choices in any form still seems
  of questionable value.
* A program which emphasized the elimination of sex-role
  stereotyping employed an assessment technique which appears
  incompatible with the intent of such efforts. The test consisted of
  a list of job titles, some male-stereotyped and some
  female-stereotyped. Students were asked to indicate whether each job
  was predominantly male, female, or dominated by neither sex. Results
  were "positive" in that students tended more toward "dominated
  by neither sex" after the program, but these results could be
  interpreted to mean that misinformation had been conveyed
  rather than that students had become more aware of and less
  subject to the inequities of the status quo.
* One project administered occupational interest inventories
  to middle school students on a pre-post basis and found that
  students' interests changed. It seems that this result could
  be anticipated with or without career education and the implicit
  assumption that interest changes are inherently desirable is
  debatable.

Although the direct approach to measurement is often the best,
it can be overdone. Several programs asked students outright whether
they had attained the program's outcome objectives; this was even done
in third grade. Only slightly more subtle than asking students, "Do
you know about a lot of different careers?" is the approach of asking
students to report how much they know about each of a number of jobs.

Tests should measure the career education concepts which the
program is designed to convey, but when the content of test items matches
precisely the content of instruction, test results can become trivial.
For example, one locally-developed test of occupational knowledge
contained a photograph of a local business establishment which the
career education students had visited. The fact that more career
education than comparison students could identify the type of business that
takes place in the building is not very convincing evidence that the
field trip was worthwhile.

Inaccurate descriptions of what tests measure are very common, but
the most popular is confusion between self-esteem and self-awareness
(e.g., "Self-awareness was measured with the Coopersmith Self-Esteem
Inventory."). Self-awareness is an important career education outcome and
probably the most challenging to measure, so we do not quarrel with the
practice of measuring self-esteem in lieu of self-understanding.
However, the limitations of measurement should be recognized, as
overlooking them leads to faulty conclusions.






Some evaluations employed a single commercial test scale for
evaluating impact with respect to two or more distinctively different
objectives, such as decision-making skills and career knowledge. If the
scale really measures both, it probably should not be used. If it
measures one to a great degree and the other to a small degree, results
should be considered an evaluation only of the dominant skill area,
although the scale's description in the evaluation report should be
accurate.

Some locally-developed tests addressed all of the program's
outcome objectives through a single scale yielding a single score, making
evaluation results difficult to interpret. Locally-developed tests
usually have the advantage of being closely related to local objectives,
but items addressing different objectives should be on different scales
for evaluation results to indicate which objectives were achieved and
which were not.

Measurement techniques for each of the USOE learner outcomes for
career education are discussed in What Does Career Education Do For Kids?
and all of the instruments used in synthesized student impact studies are
listed in the appendix along with the number of studies in which each
was used and the number where positive results were obtained. These data
can be used as one consideration in choosing tests, as they provide a
gross index of sensitivity to career education instruction. It should be
remembered, though, that evaluation results are influenced by many factors
other than the quality of the measurement instrument and that the results
other programs demonstrate with a particular instrument should be only
one of many considerations in instrument selection.

Statistical Analyses and Interpretation

No inferential analyses applied. Very often data collected from
educators, community members, parents, etc., could have been analyzed
inferentially in meaningful ways but were not. However, we do not see
this as a very serious problem in most cases; the considerable effort
involved in performing every reasonable analysis of all available data
would be appreciated by few. On the other hand, it seems rather
wasteful to launch massive student testing programs and then to report
only mean scores of students before and after the program or of career
education and comparison groups.




Vague references to statistical analyses. This weakness may fall
into the category of incomplete reporting, but the problem could be
more basic than that. Statements such as "differences were (or were
not) significant," with no other discussion of inferential statistical
analyses, raise doubts as to whether analyses were conducted.
Statements such as "results were computer-analyzed," when made without
further clarification, can leave the impression that an analysis was
conducted but that the report writer does not know which one.

Poor choice of analyses. Almost any set of data can be
legitimately analyzed in more than one way, and often the analysis
performed was not the one of greatest interest to the most people. In one
case student test results were factor-analyzed but no t-test was applied to
determine the significance of pre-post differences. Much more often,
results were analyzed separately for each of several classroom units at
a given grade level or differences between classroom units were analyzed,
even though there was no indication that either the students or their
career education experiences differed from class to class. If there is
local interest in such analyses, there is nothing wrong with this
practice, but a summary analysis including all of the career education
students together should also be conducted for summary reports and
rationale should be given for separate or comparative analyses by class.

Other poor choices of analyses were made in some cases where pre-
and post-scores were available for both career education and comparison
groups. The best use of these data is usually the analysis of covariance
or a t-test on post-test results after a t-test on pre-test scores reveals
no initial differences between groups. However, these tests were
sometimes omitted even where several other t-tests were applied, such as
matched-pairs t's for each group.

Misuse and misinterpretation of the analysis of covariance. The
analysis of covariance is a useful statistic and very often the best
one to use with pre- and post-data from career education and comparison
groups. Nonetheless, it has been given more credit than is due.

The first and most serious fallacy is that it compensates for
significant pre-treatment differences between groups. Unfortunately,
it only capitalizes on the correlation between students' pre- and
post-performance, thereby reducing the error term of F. This makes the
analysis of covariance a very powerful statistic, but it does not
eliminate the differences in expected growth rates of students who are
drawn from different pre-treatment populations and does not eliminate the
necessity of choosing a comparison group which is comparable to the career
education group. If the two groups differ significantly on the
pre-test, a more elaborate regression technique is needed.

Another fallacy about the analysis of covariance is that it tests
between-group differences in growth rates between testing dates, or that
significant differences indicate that the career education group gained
more than the comparison group. In reality, it tests differences between
groups on the post-test, as does the t for independent groups; the
difference is that post-test scores are adjusted for random pre-test
differences and the analysis of covariance is more likely than the t to
reveal significant differences. The misconception that the analysis of
covariance tests for differences between groups' gains would not be so
serious if it were not for the common belief that it also corrects for
large initial differences between groups: if the pre-test means are equal,
the difference between mean gains is equivalent to the difference between
post-test means. However, the two misconceptions usually go hand-in-hand
as in the case below where a t-test was applied to pre-test scores and its
significance led to the decision to apply the analysis of covariance
rather than the t to final results:

The significant differences between the experimental and control
groups exist in the areas of mathematics achievement and career
knowledge. Since the final evaluation will be based upon gain
scores, these differences are inconsequential.
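
The point that gain differences and post-test differences coincide only
when pre-test means are equal can be seen in a small numerical sketch.
All scores below are hypothetical and invented for illustration, as are
the names mean_gain and compare; this is plain arithmetic on group means,
not an analysis of covariance.

```python
from statistics import mean

def mean_gain(pre, post):
    """Mean of each student's post-test score minus pre-test score."""
    return mean(b - a for a, b in zip(pre, post))

# Hypothetical scores, invented for illustration only.
career_pre = [10, 12, 14, 16]       # mean 13
career_post = [15, 17, 18, 22]      # mean 18
comparison_post = [13, 15, 15, 17]  # mean 15

def compare(comparison_pre):
    """Return (difference in post-test means, difference in mean gains)."""
    post_diff = mean(career_post) - mean(comparison_post)
    gain_diff = (mean_gain(career_pre, career_post)
                 - mean_gain(comparison_pre, comparison_post))
    return post_diff, gain_diff

# Case 1: the comparison group's pre-test mean equals the career group's (13).
post_diff_equal, gain_diff_equal = compare([11, 12, 13, 16])

# Case 2: the comparison group starts lower (pre-test mean 9).
post_diff_lower, gain_diff_lower = compare([7, 8, 9, 12])

print("equal pre-test means:  ", post_diff_equal, gain_diff_equal)
print("unequal pre-test means:", post_diff_lower, gain_diff_lower)
```

In the second case the career education group scores higher on the
post-test yet gained less than the comparison group, so a conclusion
about "gains" drawn from a post-test comparison would be wrong.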

Use of the wrong t-test. Just how often the t for independent
groups was applied to pre- and post-scores of career education students
is difficult to say because few reports specify which t was used.
However, in cases where the N's associated with pre- and post-means were
reported unequal, it is reasonable to assume that the wrong t was
applied.

The appropriate t for determining whether the difference between pre-
and post-means of the same students is significantly different from zero
is the t for matched pairs or correlated samples, which involves matching
each student's pre-test score with his/her post-test score. Students who
do not have scores on both tests should be excluded from this analysis.
Using the t for independent groups on these data is invalid and is also
less likely to reveal statistically significant results.
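
The contrast between the two t-tests can be sketched in a few lines of Python. This is an illustration only: the scores are invented, the function names are hypothetical, and the pooled-variance form of the independent-groups t is assumed. Students missing either test are dropped before the matched-pairs t is computed, as recommended above.

```python
import math
from statistics import mean, stdev


def paired_t(pre, post):
    """t for matched pairs: mean difference divided by its standard error."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n)), n - 1


def independent_t(x, y):
    """Pooled-variance t for independent groups (the wrong test for these data)."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2) / (nx + ny - 2)
    se = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return (mean(y) - mean(x)) / se, nx + ny - 2


# Hypothetical (pre, post) records; None marks a test the student missed.
records = [(12, 15), (18, 20), (9, 13), (22, 23), (15, 19),
           (11, None), (None, 17), (20, 22)]

# Keep only students with scores on both tests.
complete = [(a, b) for a, b in records if a is not None and b is not None]
pre = [a for a, _ in complete]
post = [b for _, b in complete]

t_paired, df_paired = paired_t(pre, post)
t_indep, df_indep = independent_t(pre, post)
print(f"matched pairs:      t = {t_paired:.2f}, df = {df_paired}")
print(f"independent groups: t = {t_indep:.2f}, df = {df_indep}")
```

With these invented scores the matched-pairs t is several times larger than the independent-groups t computed on the same numbers, because each student serves as his/her own control.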



Confusion between statistical significance and causal relationships.
Inferential statistics are nothing more than tools for determining
whether observed differences are more likely to be real or random.
They do not, in themselves, tell us whether career education is the
cause of the differences. Some evaluation designs allow more confidence
that career education was the cause than do others, but we can never
be certain that career education was the only, or even the major,
factor influencing the results.

The most frequently over-interpreted statistical results are
those of pre-post evaluation designs without comparison groups. For
example, a discussion of the results of a matched-pairs t-test
yielding a probability level of .001 went as follows:

The question, of course, remains as to whether a comparable
group without exposure to a career education program could have
experienced like increases. Every study of which this writer is
aware after ten years in the business of statistical inference
suggests an answer of "no". Had the results of the study yielded
a probability that such gains could have occurred twenty, ten, even
five times in one hundred by chance, this evaluator would be
inclined to greater consciousness. But when the possibilities for
chance to operate climb into the thousands against being operative,
a rather firm conclusion evolves that the results were caused,
and caused by the program's impact.

These evaluation results give us a high degree of confidence that
the students' superior post-test performance was real, rather than the
result of random score fluctuations. However, the probability level
tells us nothing about the cause(s) of the improvement and it gives
us no reason to believe that similar gains would not occur without
career education. The magnitude of the gain, which is only one factor
influencing the statistical significance level, would have been a more
appropriate focus of the above discussion. That is, other research may
have indicated that a raw score mean gain of 10 percent over the period
of six months was remarkably high for this particular test, lending
credence to the hypothesis that the improvement was not a simple function
of the students' being six months older.

Confusion between statistical significance and educational
significance. Statistical significance can be and often is achieved with
a difference between means of one raw score point or less. Thus, the
magnitude of the gain or of the difference between groups should be
considered in terms of its practical significance to educational priorities.
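
A small numerical sketch shows why a one-point difference can reach statistical significance. The figures are hypothetical (a one-point difference on a test with a standard deviation of 10), the function name is invented, and the standardized effect size (difference divided by standard deviation) is an added illustration rather than something the reviewed reports used.

```python
import math


def t_and_effect_size(mean_diff, sd, n_per_group):
    """Independent-groups t (equal n, equal sd) and standardized effect size."""
    standard_error = sd * math.sqrt(2 / n_per_group)
    return mean_diff / standard_error, mean_diff / sd


results = {}
for n in (25, 100, 1000):
    t, d = t_and_effect_size(mean_diff=1.0, sd=10.0, n_per_group=n)
    results[n] = (t, d)
    print(f"N per group = {n:4d}: t = {t:.2f}, effect size = {d:.2f}")
```

The effect size stays at one-tenth of a standard deviation throughout; only the t grows as the groups get larger. The magnitude, not the t, is what bears on educational priorities.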



The responsibility of deciding whether a given level of student gain
justifies the maintenance, expansion, or replication of a program lies
more appropriately with school administrators than with evaluators.
However, evaluators can present the results in ways which facilitate
these decisions; several suggestions on this topic are offered in the
Checklist. Also, evaluators should avoid expressing excessive zeal over
small but statistically significant differences, as this practice
misleads some readers and leads other readers to question the evaluator's
judgement.







PART II

UNIQUE SOLUTIONS TO COMMON EVALUATION PROBLEMS

Our review uncovered a number of high-quality evaluation strategies,
many of them quite unique. Those which are most likely to apply to other
programs are presented here as a demonstration of the growing capacity of
career education evaluation and in hopes that they and equally creative
evaluation strategies will be applied more frequently in the future.
Student outcome measurement techniques are not included here, but are
discussed in What Does Career Education Do For Kids?

Problem: As a result of their association with the project, the
third-party evaluators have made observations and formulated
recommendations which are not based on "hard" data. They want to share their
observations in the final report, but the evaluation plan does not indicate
this as the evaluators' role.

Solution: The evaluators prepare two final reports, one based on
objective data and another clearly identified as informal, subjective
observations of the project's functioning and results. (9)*

Problem: A simple system is needed for showing the project's
activities in chronological order and for relating them to the operational
plan of the funding proposal.

Solution: A chart identifying each major activity and its planned and
actual start and completion dates. (2)

Problem: Counselors, teachers, and administrators are all trained
together. Is the staff development program weighted in favor of one of
these audiences, or is it equally well-received by all groups?

Solution: Participants rate the quality of the program on a five-point
scale for each of several questions. The mean rating of each
group is computed for each question. The one-way analysis of variance is
applied to determine whether the groups' opinions of the program differed.
The same type of analysis was applied to questions concerning knowledge
of and attitudes toward career education. (3)
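
As a minimal sketch of this solution, the one-way analysis of variance for a single question can be computed as follows. The ratings and group sizes are invented for illustration and the function name is hypothetical; the resulting F would be compared with a tabled critical value for the reported degrees of freedom.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical five-point ratings of one question by each audience.
ratings = {
    "counselors": [4, 5, 4, 3, 5],
    "teachers": [3, 3, 4, 2, 3],
    "administrators": [4, 4, 5, 4, 3],
}

for name, scores in ratings.items():
    print(f"{name:15s} mean rating = {mean(scores):.1f}")

f_ratio, df1, df2 = one_way_anova_f(list(ratings.values()))
print(f"F({df1}, {df2}) = {f_ratio:.2f}")
```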



*Refers to list of referenced projects following this section.

Problem: A sample of teachers in each of several schools is asked
multiple-choice questions concerning their involvement in career
education and their perceptions of the project's impact on the school. Are
there significant differences among schools?

Solution: Chi-square analysis of the number of individuals in
each group giving each response. (11)
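
The chi-square statistic for such a table of counts can be sketched as below. The schools, response options, and counts are hypothetical, and the function name is invented; the statistic would be compared with a tabled critical value for the reported degrees of freedom.

```python
def chi_square(table):
    """Return (chi-square, df) for a table of counts: rows = schools, columns = responses."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if schools did not differ in their responses.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, df


# Hypothetical numbers of teachers choosing responses a, b, and c.
counts = [
    [10, 20, 10],  # School A
    [20, 10, 10],  # School B
    [15, 15, 10],  # School C
]

statistic, df = chi_square(counts)
print(f"chi-square = {statistic:.2f}, df = {df}")
```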

Problem: Measuring incremental quality improvement of a career
education program in terms of its administrative structure and the support
of various groups.

Solution: An incremental quality improvement model specifying five
stages of "organizational infusion," a process through which
externally-funded projects are institutionalized to the point where the district or
school assumes total responsibility for the program's continuation and
further development. The model is based on previous research concerning
the factors influencing the long-term success of innovative programs and
specifies standards for each stage of improvement. (1)



Problem: Identifying changes in the amount of career education being
taught now with respect to the amount taught before the project began
when no baseline data were collected.

Solution: Ask teachers (counselors, etc.) how their current
involvement compares to their previous involvement in career education. (8)
In this case teachers were asked to compare their present involvement
with students to that of three years ago for each program component. For
example:

Self Awareness: Awareness of self (and others) as individuals
who have certain likes and dislikes, abilities and disabilities,
feelings, and values.

a. Very much higher than 3 years ago

b. Higher than 3 years ago

c. About the same

d. Less than 3 years ago

Collecting implementation data before and after the program is
preferable, but this approach is reasonable in cases where baseline
data have not been or cannot be obtained.







Problem: A pre-post with comparison group design is desired for
student impact assessment, but it is impossible to predict at the
beginning of the year which students will and will not become involved
in the program.

Solution: Administer the pre-test to a large number of students.
Toward the end of the year identify the career education group and the
comparison group and administer the post-test. If the groups consist
of the most and least-exposed students, some who were pre-tested need not
be post-tested. (4)

Problem: At the secondary level some teachers and counselors
are highly active in career education while others are not. Therefore,
most students receive some career education but the amount varies a
great deal from student to student. How can the amount of career
education be measured and how can its effectiveness be evaluated?

Solution: Administer a student questionnaire yielding an interval
or ratio-level measure of the individual's amount of career education
experience. Calculate the Pearson correlation of the treatment
measure with the outcome measure and test the significance of r. If r is
significantly greater than zero, a statistically significant relationship
exists between the amount of career education a student experiences and
his/her test scores. (11)

This design is appropriate where student exposure to career
education is highly variable and normally distributed and therefore where
classifying students into career education and comparison groups would
be artificial. If several treatment measures may be meaningful (such
as the number of shadowing experiences, group counseling sessions, etc.),
multiple regression analysis can be used to test their combined effects
and also to determine which activities are most highly correlated with
the outcome measures.
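
The single-measure version of this design can be sketched as follows. The exposure counts and test scores are invented and the function names are hypothetical. The significance test shown is the usual t for a correlation coefficient with N - 2 degrees of freedom, which is assumed here since the source does not name the test.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_for_r(r, n):
    """t statistic and degrees of freedom for testing whether r differs from zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2)), n - 2


# Hypothetical data: number of career education activities each student
# reported (treatment measure) and his/her outcome test score.
exposure = [2, 5, 1, 8, 4, 7, 3, 6]
scores = [14, 18, 12, 25, 17, 21, 16, 19]

r = pearson_r(exposure, scores)
t, df = t_for_r(r, len(exposure))
print(f"r = {r:.3f}, t = {t:.2f}, df = {df}")
```

As with any correlational design, a significant r shows a relationship between exposure and scores; it does not by itself show that career education caused the higher scores.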






Problem: Getting enough information out of student test
results for use in improving career education instruction.

Solution: Analysis of response patterns to each test item (7).
Compute the proportion of career education and comparison groups
answering each item correctly. Establish a standard of acceptable
performance, such as 75 percent of students answering correctly.
Divide the test items into four groups:

1. Both groups performed well. These items represent
concepts which apparently are acquired without career education and
eliminating these concepts from the career education curriculum may
be desirable.

2. Neither group performed well. These items represent
concepts which are not being acquired through the program and which
require further emphasis.

3. Comparison students performed well and career education
students performed poorly. If any items fall into this category,
attempts to convey these concepts may have been counter-productive
and the curriculum is probably in need of considerable revision.

4. Career education students performed well and comparison
students performed poorly. These items represent concepts which are
not acquired without career education and which are being conveyed
successfully through the present career education curriculum.

This type of analysis is valid only if there is very good
reason to believe that the career education and comparison groups
would perform equally well without the program. Analyzing test
results in this manner is especially important for test scales
measuring a large variety of career education concepts but may
be useful even for tests addressing a single objective such as career
information.
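
The four-way sorting of items described above is mechanical and can be sketched briefly. The item labels and proportions below are invented; the 75 percent standard is the example given in the text, and the function name is hypothetical.

```python
def classify_item(p_career, p_comparison, standard=0.75):
    """Return the group number (1-4) for one test item.

    p_career and p_comparison are the proportions of each group
    answering the item correctly.
    """
    career_ok = p_career >= standard
    comparison_ok = p_comparison >= standard
    if career_ok and comparison_ok:
        return 1  # both groups performed well
    if not career_ok and not comparison_ok:
        return 2  # neither group performed well
    if comparison_ok:
        return 3  # comparison students well, career education students poorly
    return 4      # career education students well, comparison students poorly


# Hypothetical proportions correct: item -> (career education, comparison).
items = {
    "item 1": (0.90, 0.85),
    "item 2": (0.50, 0.40),
    "item 3": (0.60, 0.80),
    "item 4": (0.82, 0.55),
}

groups = {name: classify_item(pc, pk) for name, (pc, pk) in items.items()}
for name, group in groups.items():
    print(f"{name}: group {group}")
```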




Problem: Establishing group equivalence and applying powerful
statistical analyses when no pre-test scores are available.

Solution: Use of aptitude or achievement test scores or grade
point average as the covariate in the analysis of covariance (10). Any
measure which correlates highly with post-test scores but which is not
affected by the career education program is appropriate for use in the
analysis of covariance model. If there is a possibility that career
education may affect the covariate measure (such as reading achievement),
the measure should be taken prior to the program. Since appropriate
covariate measures are often available from school records, this
evaluation strategy is an excellent approach when the evaluation is begun
after the program is in progress.
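
The adjustment step of the analysis of covariance can be sketched as follows, using grade point average as the covariate. The data and the function name are hypothetical, and only the adjustment of post-test means by the pooled within-group regression slope is shown; the significance test itself is omitted. The adjustment is meaningful only when the two groups are comparable to begin with.

```python
from statistics import mean


def ancova_adjusted_means(groups):
    """Adjust each group's post-test mean for its covariate mean.

    groups maps a group name to a list of (covariate, post_test) pairs.
    Returns (adjusted means by group, pooled within-group slope).
    """
    grand_x = mean([x for pairs in groups.values() for x, _ in pairs])
    sxy = sxx = 0.0
    for pairs in groups.values():
        mx = mean([x for x, _ in pairs])
        my = mean([y for _, y in pairs])
        sxy += sum((x - mx) * (y - my) for x, y in pairs)
        sxx += sum((x - mx) ** 2 for x, _ in pairs)
    slope = sxy / sxx  # pooled within-group regression slope
    adjusted = {
        name: mean([y for _, y in pairs])
        - slope * (mean([x for x, _ in pairs]) - grand_x)
        for name, pairs in groups.items()
    }
    return adjusted, slope


# Hypothetical (grade point average, post-test score) pairs from school records.
data = {
    "career education": [(2.0, 11), (3.0, 14), (4.0, 17)],
    "comparison": [(2.2, 10), (3.2, 13), (4.2, 16)],
}

adjusted, slope = ancova_adjusted_means(data)
for name, value in adjusted.items():
    print(f"{name}: adjusted post-test mean = {value:.1f}")
```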

Problem: Is the curriculum effective regardless of who teaches it
to whom?

Solution: A two-way analysis of variance design, with time
(pre-post) as a within-subjects variable and class as a between-subjects
variable (6). Each class experiences the same curriculum, but they may
be different in terms of teacher and student characteristics. The results
of each effect are interpreted as follows:

Time: Does student performance improve?

Class: Are the classes different?

Time x Class: Does the amount of improvement vary from class
to class?



Problem: Demonstrating the cost-effectiveness of a placement
program.

Solution: Analysis of the tax dollars generated from the
employment of students placed through the program (5).



REFERENCES

     Project Title, Director                        Funding Source &
     and Address                                    Grant Number

 1.  I-ECC Career Education Incremental             OCE
     Improvement Project                            G007503735
     Lucinda L. Kindred, Project Director
     Industry-Education Council of
     California (I-ECC)
     1575 Old Bayshore Highway
     Burlingame, California 94010

 2.  I Believe in Kids                              OCE
     Albert Thomas, Jr., Project Director           G007502294
     Jefferson County School Board
     P.O. Box 499
     Monticello, Florida 32344

 3.  Illinois Career Education Area Service         OCE
     Centers (Urban and Rural): A Vehicle           G007503404
     for Demonstration
     Carol Reisinger, Project Director
     Illinois Office of Education
     100 North 1st Street
     Springfield, Illinois 62777

 4.  Career Resource Project                        Part D
     Joe W. Roth, Project Director                  0-73-5312
     Indiana State Board of Vocational-
     Technical Education
     401 Illinois Building
     17 West Market Street
     Indianapolis, Indiana 46204

 5.  An Exemplary Program for Career                Part D
     Education                                      0-73-5308
     John J. Vandersypen, Jr., Site
     Coordinator
     Natchitoches Parish School Board
     P.O. Box 16
     Natchitoches, Louisiana 71457

 6.  New York State Consortium for                  OCE
     Career Education                               G0075023[illegible]
     Dr. Gordon E. Van Hooft, Project Director
     New York State Education Department
     Albany, New York 12234

 7.  D.E.C.E.M.: District Eleven Career             OCE
     Education Model                                G007503732
     Mrs. Wimell Thomak, Project Director
     Community School District #11
     1250 Arnow Avenue
     Bronx, New York 10469

 8.  Comprehensive Career Education Project         Part D
     in Springfield Public Schools                  0-73-5288
     Donovan D. Kimball, Project Director
     Springfield School District
     525 Mill Street
     Springfield, Oregon 97477

 9.  Language Experience Based Awareness            Part D
     Hands On Exploration + Competency Based        0-73-5272
     Preparation = A School Based Total
     Career Education Model
     Edward H. Lareau, Jr. & Clifford A.
     Bayliss, Jr.
     Admiral Peary Area Vocational-Technical
     School
     Rt. 422 W., R.D. #2
     Ebensburg, Pennsylvania 15931

10.  Career Education for Gifted and Talented       OCE
     Students                                       G007502316
     William W. Cox, Project Director
     Highline Public Schools #401
     15675 Ambaum Blvd., S.W.
     Seattle, Washington 98166

11.  Highline's Career Alternatives Model           Part D
     Dr. Ben A. Yormark, Project Director           0-73-5289
     Highline School District #401
     15675 Ambaum Blvd., S.W.
     Seattle, Washington 98166



PART III

A NOTE TO PROJECT DIRECTORS ON HOW TO GET THE MOST FROM YOUR THIRD PARTY

A third-party evaluator or evaluation agency can be a real asset
to a program, but they can also be an expensive nuisance. Which
experience you have depends largely on the evaluator(s) you select and on
the way you work with him/her/them.

A common method of selecting an evaluator is to issue a request for
proposals asking bidders to write an evaluation plan based upon the
project's funding proposal or an abstract of it. The successful bidder
is the one who seems to propose a reasonable plan at a low price. This
approach has its merits, but it also has serious disadvantages.

The best evaluation plan is one which addresses the information
needs of various audiences of a project through realistic means. Funding
proposals tend to be less than crystal clear about the information needs
of the project staff and local decision-makers, even though bidders may
be able to identify the evaluation questions of interest to potential
adopters of your model program. Also, evaluation designs are virtually
always compromises of the ideal (from a research point of view) with the
feasible (taking into consideration local constraints on data collection).
Thus, the evaluator needs answers to such questions as, "Can we get the
cooperation of a comparison group?" and, "Can we count on teachers to
record their career education activities?" before preparing an
evaluation plan. Since your help is needed in planning the evaluation, this
step should follow, not precede, the selection of an evaluator.

The selection of evaluators should be taken as seriously as the
selection of key members of your staff and should be handled in much the
same way. First decide what kinds of services you expect. Will you need
an instrument developer? A statistician? A management consultant? A
deft report writer? An interviewer? A cost analysis specialist? Do
you expect your evaluator to have experience in a particular evaluation
technique? To be well-versed in career education? Most evaluators are
good in at least one of these, but few individuals (or even agencies)
excel in them all. Next, design your request for proposals in a way
that will show which bidder has the best capacity for delivering the
services you need. Copies of evaluation reports and evaluation plans the
bidder has prepared for other programs can be elucidating. Another
possibility is to ask the bidder to suggest alternative solutions to a
particularly thorny evaluation problem you foresee, such as how to
measure the amount of career education high school students are getting.

Ask bidders for complete lists of their recent clients and follow
up on all of them. Your telephone bill will probably be money well
spent.

Evaluation services are expensive and you should be prepared for
that, but there are steps you can take to increase your chances of
getting what you pay for. Ask for bidders' fee schedules, of course,
but also investigate how charges are computed. Otherwise, you could
end up paying for a day's services for two hours of work, or you and
another client may both be charged in full for developmental work that
applies to both projects. Contractors have even been known to charge
two projects for the full travel cost of a single trip.

Project directors usually know how much they can afford to spend
on evaluation, so asking for bids per se is to little advantage; it is
better to find out how much quality, as well as quantity, each bidder
can deliver for the price you have in mind. A good rule of thumb for
budgeting a thorough external evaluation is ten percent of the total
grant award. If this cannot be arranged, plan to perform some
evaluation tasks internally, perhaps using the external evaluator more as
a consultant and auditor than as a developer, data collector, data
analyzer, and reporter. Remember, too, that doubling an evaluation
budget more than doubles the services the money can buy, as there are
certain fixed costs such as travel, planning time, and report preparation
which vary relatively little with the amount of the contract.
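
The arithmetic behind the fixed-cost point can be sketched as follows. All dollar figures are invented for illustration, the function name is hypothetical, and fixed costs are treated as strictly constant, a simplification of "vary relatively little."

```python
def service_dollars(budget, fixed_costs):
    """Dollars left for evaluation services after fixed costs are paid."""
    return max(budget - fixed_costs, 0)


fixed_costs = 3000  # hypothetical: travel, planning time, report preparation
small_budget = 10000
large_budget = 2 * small_budget

small_services = service_dollars(small_budget, fixed_costs)
large_services = service_dollars(large_budget, fixed_costs)
ratio = large_services / small_services

print(f"services at ${small_budget}: ${small_services}")
print(f"services at ${large_budget}: ${large_services}")
print(f"doubling the budget multiplies services by {ratio:.2f}")
```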

Once an evaluator is chosen, a written and notarized contract is a
good idea. Settle for a grant arrangement only if you have the utmost
confidence in the integrity and reliability of the individual or agency
you have chosen. It is usually preferable to insist on an accounting
of all charges and to pay for services only after they are delivered.
Make sure that the contract covers the possibilities of the contractor's
over- or under-expending the budget and stipulates that the final
payment will be withheld until the final evaluation report is submitted in
acceptable form. If the individual or agency has a reputation for
tardiness you may want to consider a stipulation that the contractor
pay the project for every day between the final report's deadline and
your receipt of it.






The earlier the signing of the contract, the better. Delays in
instituting evaluation systems seriously compromise the quality of the
final evaluation and also leave the program without important feedback
systems when they are most needed, during the early stages of
development. It is best, in fact, to line up an evaluator when the funding
proposal is written, particularly now that the USOE is often switching
to October start-up dates.

Time should be allowed for a lengthy meeting with your evaluator(s)
as early as possible for planning the evaluation. He/she/they by then
should be familiar with your proposal, but will need more details about
the project's plans and priorities and about circumstances in the schools
and community which will affect the evaluation.

Ask for a written draft evaluation plan soon thereafter. If you are
not totally satisfied with it, negotiate revisions for the final version.
The plan should specify the objectives, questions, or hypotheses to be
addressed, the evaluation strategies associated with each, a time
schedule, and who is responsible for what. Written plans are good for
assessing the design's adequacy before it is too late to change it, for
preventing misunderstandings, for facilitating the preparation of final
reports, and for pacifying federal project officers.

Program evaluation designs can only be as systematic as the program
itself. If the program's objectives are nonexistent or nebulous, do not
expect the evaluator to know what outcomes to look for. If plans for
achieving the objectives are ill-defined or if they change daily, do not
expect the evaluator to determine which strategies are successful. If
the project staff do not know which schools and individuals are involved
in project activities, do not expect the evaluator to know who to ask for
evaluation data.

Always insist on reviewing data collection instruments before they
are put into use, whether they are commercial or developed by your
evaluator. Certainly the evaluator's advice concerning the technical
soundness of an instrument should be given careful consideration, but the
project staff should make sure that the instrument addresses the program's
objectives and/or information needs.



Although this applies to all kinds of data collection instruments, it
is particularly key for student tests. For example, careful
examination of a test of decision-making skills may reveal that the test
developer had a different definition of decision-making skills than does
your program. If students who give all "correct" answers are not what
your program is trying to produce, the results of the test may lead to
unjustified conclusions about your program's effectiveness, and you have
every right to veto its use. At the same time, though, realize that
many career education outcomes are difficult to measure and that you
are not likely to find or to be able to develop the "perfect" test for
your program.

Evaluators cannot do their jobs without a great deal of cooperation
on the part of the project staff, particularly in the area of data
collection. If the mutually-agreed-upon evaluation plan requires the
maintenance of project records, agree upon a system that will minimize
the burden but be sure that the system is applied conscientiously. If
the project staff agrees to take the responsibility of distributing and
collecting questionnaires, follow through on the agreement. Incomplete
data not only gives evaluators headaches; it also compromises the quality
of your evaluation and usually costs you money for the extra time it
takes your evaluator to compensate for the problem.

Communicate regularly with your evaluator. Frequent site visits
are desirable, but if they can't be arranged you should talk on the phone
and/or correspond at least monthly. Ask for prompt feedback on data
collected and observations made; there is no need to wait until the
final report to learn of preliminary evaluation findings. Similarly,
keep the evaluator informed about the program's progress, problems,
changes in plans, etc. He/she/they will need to know these things
because they may affect the evaluation or its results in ways you may not
foresee. Besides, evaluators have been known to give good advice on
occasion.

When final report time draws near, meet with the evaluator to plan
coordinated performance and evaluation reports. Our review showed that
the two reports tend to be redundant, yet much important information is
left out of both. Both of these problems can be avoided by outlining
the reports together.



If at all possible, ask for a draft final evaluation report well in
advance of the deadline for submitting the final performance report. Be
reasonable, of course, and do not expect it three days after the last
data are collected, but barring vacations and unusual circumstances a
month should be plenty of time. If you feel the report is incomplete or
unfair, ask for revisions, keeping in mind that the evaluation budget is
probably by now expended and that it is unfair to ask evaluators to
violate their own integrity.

We consider project directors not only justified, but duty-bound to
request corrections of errors such as: evaluators' opinions presented
as facts, misinformation (such as incorrect reports of the number of
schools the project serves), interpretations of data which do not take
into account significant factors of which the evaluator was unaware, and
reports or sections of reports which make no sense either because they are
badly written or because they are lacking in important information.

Revisions which should not be requested are: changes in objective
evaluation results, omissions of data which are not complimentary to the
project, and eliminations of critical but substantiated comments about
the project.

Evaluators often find themselves in the curious position of being
obligated to bite the hand that feeds them. With an appreciation of this
situation and other problems evaluators face, balanced by a recognition
of the rights of consumers of evaluation services, the project director
can do much to make the evaluator a key contributor to the program's
success.



PART IV 

CHECKLIST FOR REPORTING RESULTS OF
STUDENT OUTCOME STUDIES IN
CAREER EDUCATION

What the Checklist Is and Is Not 






The checklist is a guide for ensuring that evaluation reports contain all of
the information needed by others in interpreting reported results. We recognized
the need for such a tool after reviewing a number of evaluation reports of career
education programs funded by the U.S. Office of Education in the past and finding
that these reports frequently lack such essential information as the number of
students involved in the evaluation or the name of the statistic applied to the
results. These errors of omission are almost certainly the result of oversight
in the majority of cases, so it seemed that a system for checking the drafts of
evaluation reports would be useful.

The checklist is not a dramatic breakthrough in the field of career
education evaluation. It does not delve into the design, implementation, or analysis
phases of evaluation.* It places no judgement on the quality of various
evaluation strategies, but rather deals with the most popular designs, be they good, bad,
or indifferent. Nor is the checklist a complete guide to report-writing; it
concerns itself with content while leaving style and format to the choice of the
author.

The checklist applies only to the sections of evaluation reports concerning
programs' impact on students. It was designed for the typical student outcome
evaluation where paper-and-pencil test scores of one or more groups of students
are analyzed utilizing inferential statistics. If your program is using a more
unique evaluation approach, the checklist still may be useful, but some of the
items will not apply.

Some will be of the opinion that the checklist calls for too much technical
information. As we see it, an evaluation report should be meaningful to anyone
who may have an interest in your career education program, including such diverse
groups as the local school staff, the Chamber of Commerce, the U.S. Office of
Education, and educators and non-educators nationwide. Most of these people are
not statisticians, but some are, and those with technical expertise will tend to
be skeptical of reported results if key technical information is not included in
a report. It is usually possible to present the more technical material
unobtrusively in parenthetical comments, footnotes, and tables while preserving the
report's readability and utility for diverse audiences. Some items included in the
checklist are redundant, such as the number of students involved in the study and
the degrees of freedom associated with the statistic, but the inclusion of both
pieces of information will enhance the credibility of the evaluation in the eyes
of many.



*If this is your concern, refer to the following publications:
A Practical Guide to Measuring Project Impact on Student Achievement by D. Horst,
K. Tallmadge, and C. Wood of RMC Research Corporation, 1975, ERIC number
ED 106376.

Evaluation and Educational Decision-Making: A Functional Guide to Evaluating
Career Education by M. B. Young and R. G. Schuh of Development Associates, Inc.,
1975, ERIC number ED 117185.






Some individuals, particularly advocates of stringent research designs,
will find the standards of the checklist too low. There is no mention, for example,
of [illegible] homogeneity of variance. Such omissions should not be taken
to mean that such information is considered irrelevant. Rather, the checklist is
geared toward improving the reporting of evaluations as they are commonly conducted
[illegible] at this time. Thus, the items of the checklist
should be viewed as essential but minimal; exceeding these standards is [illegible].

The checklist was designed for use by' career education program staff and bv " 
cT he^^h" U-'"'r'?^ draft .evaluation reports-. ff used in tSis waj! a copy ' 
of the checklist should be made for each student imfact study, usaally defined 
as one scale of a test administeted at one grade letel. In this way/independent 
c;^-cks can be made of the thoroughness of the reporting of the evalu^t on co^po- 
nen s associated wi thl eai:h' outco^^e objective. As th^Veport is reviewed . oEeck 

sho ,H K K'\'i'':f '''"^^ f"""'^ ^" reporVa numbered checkl . ! en> 

hou ^be checked of/ only if all applicable lettered items under it appea In^ he 



In addition, the checklist also may prove useful in earlier stages
of evaluation by serving as a guideline for ensuring that evaluation plans include
the collection of all data and the performance of all analyses needed for the re-
port, for judging the quality of past reports prepared by the bidders for an eval-
uation contract, and for outlining or writing evaluation reports.

The career educator who is not also a statistician may find several unfamiliar
terms in checklist items 6 and 7. In many cases it still will be possible to
recognize the information if it is contained in a report. For example, if a table
reports a value under an unfamiliar label, it is not necessary to know what the
value is to know that it is reported. We do, however, advise caution to the
non-statistician in concluding that particular statistical data are missing since
many statistical terms go by several names. Thus, apparent omissions of a tech-
nical nature should be discussed with the author or other specialist before firm
conclusions are reached concerning the report's status on items 6 and 7.

Following the checklist itself is a further elaboration of the rationale and
standards for each checklist item. After that is a fictitious student impact
study report designed to demonstrate how the information associated with each
checklist item may appear in an actual report.




CHECKLIST FOR REPORTING RESULTS OF
STUDENT OUTCOME STUDIES IN
CAREER EDUCATION

___ 1. The career education objective(s) assessed by the study

___ 2. The career education program whose impact is being assessed

    ___ a. Conceptual approach
    ___ b. Staff training in career education
    ___ c. Student activities

___ 3. The students involved in the study

    ___ a. Method of selecting students for each group
    ___ b. Grade level
    ___ c. Number of students in each group (N)
    ___ d. Unique characteristics of students involved in the study
    ___ e. If a comparison group is used, evidence that the comparison
           group is comparable to the career education group

___ 4. The measurement tool

    If you use a commercially-available test, include:

    ___ a. The full name of the test
    ___ b. The name or number of the form
    ___ c. The publisher
    ___ d. The name of each scale used in the evaluation
    ___ e. A description of the specific skills, attitudes, or knowledge
           measured by each scale
    ___ f. The kind of score analyzed

    If you use a locally-developed test, include:

    ___ g. The name of the test
    ___ h. The name of each scale
    ___ i. A description of specific skills, attitudes, or knowledge
           measured by each scale
    ___ j. A copy of the test
    ___ k. The scoring key
    ___ l. A description of test development procedures
    ___ m. Any available information concerning reliability and validity

___ 5. Test administration

    ___ a. Dates of testing
    ___ b. Testing procedures
    ___ c. Rationale for any elimination of scores before analysis

___ 6. Descriptive statistical results

    ___ a. Group means
    ___ b. Standard deviations

___ 7. Inferential statistical parameters and results

    If you use a t-test this includes:

    ___ a. Whether the independent groups or matched-pairs t was used
    ___ b. Whether the test was one-tailed or two-tailed
    ___ c. Degrees of freedom (df)
    ___ d. Value of t
    ___ e. Value of p

    If you use the analysis of variance or the analysis of covariance this
    includes:

    ___ f. The analysis of variance design
    ___ g. Degrees of freedom
    ___ h. Value of F
    ___ i. Value of p
    ___ j. If significant differences are found and the study involves three
           or more groups of students, a test of multiple comparisons

___ 8. Interpretation of results

    ___ a. The meaning of the statistical results
    ___ b. Your interpretation of the reason for the results
    ___ c. The educational significance of results

___ 9. Final checks

    ___ a. Clearly labelled tables
    ___ b. Accurately typed numerical data



Explanation of Checklist Items



1. The career education objective(s) assessed by the study

The career education objective provides the rationale for the evaluation.
It should be stated in terms of what should be different or better about students
as a result of their career education experiences. The same information can be
conveyed in the form of evaluation questions or hypotheses.

2. The career education program whose impact is being assessed 

Measurement of the extent and nature of career education exposure is for
some programs as large a problem as measurement of student outcomes themselves.
However, if evaluation results are to have utility either within or outside of
the school system, it is critical to describe the career education program as
thoroughly and as quantitatively as possible.

a. Conceptual framework

The conceptual framework of the K-12 program is usually described in
the body of the project's report rather than in the evaluation report, but
it is included as a checklist item because it is important in interpreting
outcome results. The description should indicate for each grade level or
grade level grouping: 1) the major objective(s) of career education (e.g.,
the development of self awareness), and 2) the global implementation strategy
(e.g., classroom infusion).

b. Staff training in career education

"Staff", as used here, refers to the individuals who "teach" career educa-
tion to students and may include counselors, librarians, community resource
persons, parents, etc. Staff training data have two purposes for student im-
pact studies: 1) as an indication of the resources required to produce the
observed student effects, and 2) as evidence that staff are familiar with
the career education model which they are presumably implementing.

Some guidelines for reporting staff training data in conjunction with
student impact studies are:

1. Data should be presented for the specific individuals who delivered
career education to the students included in the study. This means
that data such as the number of teachers in the district who have
participated in inservice sessions are relevant only if students are
selected randomly from all classrooms in the district, or if all stu-
dents in the district are included. Where students are selected for
outcome measurement because they are in classrooms where career educa-
tion is used extensively, the training of their teachers and other
staff who "taught" them career education is of interest; district-
wide training data have little to do with those particular students'
outcomes. This is not to say that data concerning the extent and
nature of the project's staff training program should not be col-
lected and reported where student impact assessment focuses on the
"most exposed" students, but that these data will not fulfill the
purposes for reporting staff training in conjunction with student
outcome studies.



2. The time frame as well as the amount of staff training should
be specified (e.g., 25 hours of training during the past two years).
If the amount is not presented in standard time units (such as hours
or days) it should be translatable to time (such as the number of half
day sessions).

3. The source(s) of data should be indicated.

4. Training strategies and topics also should be addressed, but not
necessarily in the evaluation report.

c. Student activities

Like staff training data, information concerning the amount and type of
career education experienced by students should be given for the specific
students involved in the impact assessment. Again, there are often good
reasons for determining the average amount and type of career education the
district's students have experienced. However, such data are relevant to
measured student outcomes only if students are selected randomly from the
school or district for outcome measurement rather than on the basis of their
participation in a particular set of career education activities.

The ideal evaluation would answer all of the following questions at each
grade level relevant to the student impact assessment:

1. How many and what proportion of students have been exposed to
career education? (This question need be answered only if the
student sampling plan allows the inclusion in the "career edu-
cation" group of some students who in fact have experienced no
career education.)

2. What types of career education activities have these students
experienced? Activities should be categorized according to their
primary purpose (e.g., self-awareness, job-seeking skills) as well
as their operational nature (e.g., field trips, role-playing).

3. How many and what proportion of students experienced each type of
activity?

4. What was the average amount of student exposure to each activity
type? The amount of exposure should be presented in time units
or the number of occasions on which the activity was experienced,
depending on the nature of the strategy.

5. What was the average amount of student exposure to "career educa-
tion", or to all activity types combined?
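As an illustration of how questions 1, 3, 4, and 5 might be answered from a simple
activity log, the following is a minimal sketch only. The record layout, the student
identifiers, and the function name are hypothetical and are not part of the checklist;
the sketch assumes every logged student also appears in the group roster, and it
averages over all students in the group so that figures are in per-student terms.

```python
from collections import defaultdict

# Hypothetical activity log: (student_id, activity_type, hours of exposure).
records = [
    ("s1", "field trip", 3.0),
    ("s1", "role-playing", 1.0),
    ("s2", "field trip", 3.0),
    ("s3", "role-playing", 2.0),
]
# Every student in the "career education" group, including any with no exposure.
students = ["s1", "s2", "s3", "s4"]


def exposure_summary(records, students):
    """Summarize exposure per activity type and overall, in per-student terms."""
    hours = defaultdict(lambda: defaultdict(float))  # activity -> student -> hours
    for student, activity, amount in records:
        hours[activity][student] += amount

    n = len(students)
    by_activity = {}
    for activity, per_student in hours.items():
        by_activity[activity] = {
            "n_students": len(per_student),            # question 3 (number)
            "proportion": len(per_student) / n,        # question 3 (proportion)
            "mean_hours": sum(per_student.values()) / n,  # question 4
        }

    exposed = {s for per_student in hours.values() for s in per_student}
    total_hours = sum(amount for _, _, amount in records)
    overall = {
        "n_exposed": len(exposed),                     # question 1 (number)
        "proportion_exposed": len(exposed) / n,        # question 1 (proportion)
        "mean_hours_all_activities": total_hours / n,  # question 5
    }
    return by_activity, overall


by_activity, overall = exposure_summary(records, students)
```

Question 2 (the types of activity and their purposes) is descriptive and would still
need to be answered in the text of the report.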

Realistically, all but the first question may be difficult to answer un-
less the career education program being assessed consists of a small number of
separate courses or discrete instructional units. With full recognition of
the difficulties associated with this checklist item, we recommend that avail-
able data be presented in per-student terms and that consideration be given
early in the program to incorporating the collection of these types of data
into the evaluation plan. A few other pointers on this topic are:

1. The time frame of student activities data should be stated. If at
all possible the time frame of activity data should be the same as
the time frame of the program under evaluation. Some assessments
address the impact of several years of career education implementa-
tion. For example, a three-year-old high school program may be
assessed by testing twelfth graders. In this case, student activity
data should incorporate the first year's tenth grade activities, the
second year's eleventh grade activities, and the third year's twelfth
grade activities. If you use a pre-post evaluation design, activity
data should cover the period between testing dates.

2. Even if the program is primarily classroom-based, the school is
likely to provide other career education activities such as counseling,
assembly programs or school-wide career fairs. Since such activities
are easily overlooked if student activity data are collected solely
from teachers, other data sources should be considered.

3. The data source(s) should be indicated in terms of who provided the
information, how, and when.

4. If your evaluation involves a comparison group, it should not be
assumed that those students have not experienced career education.
Activity data should be collected and reported for comparison, as
well as career education, students.

5. Very often it is impossible to describe student activities as thor-
oughly and as quantitatively as you would like, but whatever infor-
mation you have should be shared with the report's readers, even if
it is based only on informal observations of the program.

3. The students involved in the study

a. Method of selecting students for each group

If students are drawn "randomly" the description of the sampling tech-
nique should include any constraints on the randomness of the sample. For
example, "The sample was drawn to give proportionate representation to the
academic, vocational, and general curricula," or "Since students could not
be taken out of class, only students whose schedules included a study hall
were tested." It should also indicate whether students were selected on
an individual or classroom basis.

If students are selected on the basis of their participation in a partic-
ular class or program, the question is not, "How were students selected for
the evaluation?", but rather, "How were students selected for the program?"
This question often is equivalent to, "Why does a student get this teacher
rather than another?", which is answered by describing the school's class-
room assignment practices, or it may be, "Why does a student go to this
school rather than another?", which no one is likely to ask. Another issue
may be the manner in which one of several similar programs was chosen for
assessment, which is usually based on one program's relative intensity and/or
on the convenience of data gathering.

b. Grade level

If students from several grade levels are included in a single group,
indicate the number or proportion representing each grade level. If the
school system does not use the traditional K-12 grade level designations,
give the group's age and the bases for assigning students to classroom
units, modules, or courses.

c. Number of students in each group (N)

d. Unique characteristics of the students

If the program is designed for an identified sub-population
within the system, such as gifted, handicapped, or vocational students,
this, of course, should be stated. Since others will want to use your evalu-
ation results or to replicate your efforts in other settings, it also is
helpful to describe less notable student characteristics
such as the nature of the community, and any way in which the
students involved in the evaluation are atypical of the school population
(such as predominantly male or below-average achievers).

e. If a comparison group is used, evidence that the comparison group is
comparable to the career education group

This is needed even if both groups are tested on a pre- and post-test
basis and the analysis of covariance model is used for analysis. In this
case, evidence of no significant differences between groups on the pre-test
will suffice.

For the post-test with comparison group design it is essential to estab-
lish the groups' equivalence on educationally-relevant variables if results
are to be interpreted meaningfully. Without such evidence, it is impossible to say
whether differences between the groups' outcomes are due to the career educa-
tion program or to differences in the students themselves. Better than nothing
but still inadequate evidence of group equivalence is a statement like, "The
two groups were drawn from schools serving communities of very similar socio-
economic characteristics." A little better is, "The two groups represent
the two third-grade classes in a school where students are assigned randomly
to classroom units," or "Like the career education group, the comparison
students were drawn from the college preparatory curriculum." However, a
program committing resources to assessing student impact should seriously
consider devoting further effort to establishing the credibility of the
comparison group. If the school system has a testing program, scores from
recently-administered aptitude or achievement tests can be used to test group
equivalence on these dimensions. Grade-point averages can be used in the
same way.

4. The measurement tool

Widely-recognized problems of measurement in career education make it essen-
tial to convey precisely what student attributes were measured. The following
information provides a sufficient operational definition of these attributes to
allow the reader to draw his or her own conclusions regarding the meaning of the
results.



are multiple levels or parallel forms of the same test.)






c. The publisher

If the publisher is not widely known, the full address is also helpful.

d. The name of each scale used in the evaluation

e. A description of the specific skills, attitudes, or knowledge measured
by each scale

This should resemble the career education objective associated with the
scale but is usually more narrowly defined. For example, an objective may
read, "Students will acquire career decision-making skills," but the descrip-
tion of the scale used in measuring the attainment of the objective may be,
". . . assesses the student's ability to select from a list of job titles the
occupations most appropriate for an individual whose interests and personality
traits are given." Publishers' test manuals often contain adequate scale des-
criptions which can be quoted directly. This checklist item may be omitted
for tests of basic academic skills if scale names clearly identify the attri-
butes they measure (e.g., reading comprehension).

f. The kind of score analyzed

Examples are raw scores and standard scores.

Additional helpful items are: the number of items on each scale; a descrip-
tion of item types (e.g., multiple choice); a sample item from each scale; a sum-
mary of the publisher's reported evidence of reliability and validity.

If you use a locally-developed test, include: 

g. The name of the test 

Locally-developed tests should be named so that others can reference them
easily.

h. The name of each scale

i. A description of the specific skills, attitudes, or knowledge measured by
each scale

(See 4e)

j. A copy of the test

If the test has more than one scale, indicate which items belong to which
scales.

k. The scoring key

For objective tests this can be indicated on the test booklet. For sub-
jective tests the scoring criteria and procedures should be given in detail.



l. A description of test development procedures

Include who was involved in test development (e.g., teachers) and the
source of concepts tapped by items (e.g., grade level objectives developed
by the project staff).

m. Any available information concerning reliability and validity

A minimum standard for establishing the validity of locally-developed
tests is a review by individuals other than the test developers for judging
whether the test appears to measure the student attributes it is intended
to measure. Such reviews also can focus on factors which influence relia-
bility, such as the adequacy of the test's length, its readability, and the
appropriateness of response scales. If a review is undertaken, the evalua-
tion report should describe 1) who performed the review, 2) the criteria used
in assessing the test, 3) a brief summary of results, and 4) the way in which
results were used in revising the test. Highly desirable but sometimes im-
practical is field-testing the instrument before it is used as an evaluative
tool. If this is done, specify at a minimum 1) the number and grade level
of students involved in the field-testing, 2) the analyses applied to the
data, 3) a brief summary of the results of the analyses, and 4) the manner in
which the field-test results were used in revising the test.

The same test results used in the evaluation itself also can be analyzed
in a variety of ways for describing psychometric properties of the test. It
is beyond the scope of this checklist to discuss the reporting of such analy-
ses; suffice it to say that whatever analyses are performed should be reported.

5. Test administration

a. Dates of testing

Indicate within a week or two the time(s) of data collection (e.g., "the
last week in May").

b. Testing procedures

Note should be made of any significant deviations from the administration
procedures stipulated by the publisher of commercial tests, of differences
in procedures in administration for different groups of students, or of
changes in procedures between pre- and post-testing.

c. Rationale for any elimination of scores before analysis

Indicate why and to what extent some students were not included in data
analysis. A common reason is missing pre- or post-test scores.

6. Descriptive statistical results

If the inferential statistic is non-parametric, non-parametric descriptive
statistics will be substituted for means and standard deviations (e.g., a fre-
quency table for the chi-squared; medians and ranges for the Mann-Whitney U).

a. Group means

Present all means relevant to the analysis with their associated N's.
If the analysis of covariance is applied this includes adjusted post-test
means. It is conventional to round means, standard deviations, and values
of t and F to the nearest hundredth; more digits than this are unnecessary
and confusing. It is also helpful to compute for the reader differences be-
tween the means compared in the analysis; present negative differences as
negative numbers.

b. Standard deviations

It is good practice to present the standard deviation associated with each
mean as they, too, can be interpreted by some in meaningful ways. This item
may be omitted so long as the value of t or the complete analysis of variance
summary table is presented but it is preferable to include standard devia-
tions in any event. Be careful to avoid confusion between the standard devia-
tion and the variance.
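The descriptive figures called for under item 6 can be produced with a few lines
of code. The sketch below is illustrative only: the scores are made-up values, the
function name is hypothetical, and the sample (n - 1) standard deviation is assumed
because the checklist does not say which form to use. The variance is reported
alongside the standard deviation simply to show that the two are different numbers.

```python
import statistics

# Made-up raw scores for two small groups (not data from any real study).
career_education = [12.0, 15.0, 11.0, 14.0, 13.0]
comparison = [10.0, 12.0, 11.0, 13.0, 9.0]


def describe(scores):
    """Return N, mean, standard deviation, and variance, rounded to hundredths."""
    return {
        "N": len(scores),
        "mean": round(statistics.mean(scores), 2),
        # Sample standard deviation (divides by n - 1); an assumption of this sketch.
        "sd": round(statistics.stdev(scores), 2),
        # The variance is the square of the standard deviation, not the same value.
        "variance": round(statistics.variance(scores), 2),
    }


career_stats = describe(career_education)
comparison_stats = describe(comparison)

# Difference between the means compared in the analysis; a negative value
# would be reported as a negative number.
mean_difference = round(career_stats["mean"] - comparison_stats["mean"], 2)
```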

7. Inferential statistical parameters and results

The guidelines below apply directly only to the most widely-used parametric
statistics, but most inferential statistics have parameters analogous to those
of the t and F. In any case the report should be very specific about which sta-
tistic was applied, including a reference if the statistic is at all uncommon.

If you use a t-test this includes:

a. Whether the independent groups or matched-pairs t was used

Although the appropriate t should be clear from the design, specifying
which was used is reassuring.

b. Whether the test was one-tailed or two-tailed

c. Degrees of freedom (df)

d. Value of t

If the difference between means is negative (e.g., the comparison group
scored higher than the career education group) t also should be negative.

e. Value of p

Since different people use different standards of statistical signifi-
cance, it is best to give the observed p value if results meet your stand-
ards. If you report "no significant differences", it is important to indi-
cate the alpha level. If, for example, an alpha of .1 is used for three
independent analyses, the reported values may be: p<.1; p<.005; p>.1.

The American Psychological Association (APA) style for presenting items
c, d, and e within the text is, "Results indicate that the career education
group scored significantly higher than the comparison group, t (48) = 2.62,
p<.01."
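To make items c and d concrete, the following sketch computes the independent
groups t and its degrees of freedom from raw scores and formats the result in the
style quoted above. It is a sketch under stated assumptions: the scores are made-up,
the function names are hypothetical, and the pooled-variance form of t is assumed.
The p value is passed in as text because it would normally be read from a t table or
taken from a statistical package; it is not computed here.

```python
import math
import statistics


def independent_groups_t(group_a, group_b):
    """Return (t, df) for the independent groups t with pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_variance = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    # t carries the sign of the difference: negative if group_b scored higher.
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df


def apa_t_report(t, df, p_text):
    """Format df, t, and p in the in-text style shown in the checklist."""
    return f"t ({df}) = {t:.2f}, {p_text}"


# Made-up scores for illustration only.
career_education = [2.0, 3.0, 4.0, 5.0, 6.0]
comparison = [1.0, 2.0, 3.0, 4.0, 5.0]

t_value, df = independent_groups_t(career_education, comparison)
reversed_t, _ = independent_groups_t(comparison, career_education)

# Reproduces the checklist's own example wording from its reported values.
checklist_example = apa_t_report(2.62, 48, "p<.01")
```

Swapping the order of the two groups changes only the sign of t, which is why the
report should say which group's mean was subtracted from which.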

If you use the analysis of variance or the analysis of covariance this
includes:

f. The analysis of variance design

For example, "the one-way analysis of variance for three independent
groups" or, "the one-way analysis of covariance, where pre-test scores
served as the covariate and post-test scores as the dependent variable."

g. Degrees of freedom (df)

h. Value of F

i. Value of p

(See item 7e)

Although there is a trend away from presenting analysis of variance summary
tables, it is still good practice. An adequate format for a one-way analysis of
variance (or covariance) is:



Source of Variance       df         MS          F

Between groups            2       125.95      31.25*

Within groups            72         4.03

Total                    74

*p<.05



If these items are specified in the text as suggested above
and if the standard deviations associated with each group mean are presented, the
analysis of variance summary table may be omitted. The APA style for presenting
these items within the text for the one-way analysis of variance
is, ". . . among the mean scores of the three groups, F (2, 72) =
31.25, p<.05."
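The entries of the summary table are related by simple arithmetic: each mean
square is a sum of squares divided by its degrees of freedom, and F is the
between-groups mean square divided by the within-groups mean square. The sketch
below shows this for a one-way design with independent groups. The function name
and the small data set are hypothetical; only the last line uses figures from the
table above.

```python
def one_way_anova(groups):
    """Return df, mean squares, and F for a one-way analysis of variance."""
    scores = [x for group in groups for x in group]
    grand_mean = sum(scores) / len(scores)
    group_means = [sum(group) / len(group) for group in groups]

    # Between-groups sum of squares: group size times squared distance of
    # each group mean from the grand mean.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Within-groups sum of squares: squared distance of each score from
    # its own group mean.
    ss_within = sum(
        (x - mean) ** 2 for group, mean in zip(groups, group_means) for x in group
    )

    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "df_total": df_between + df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Made-up scores for three small groups, for illustration only.
example = one_way_anova([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])

# Consistency check on the summary table above: MS between / MS within.
table_f = round(125.95 / 4.03, 2)  # 31.25, matching the tabled F
```

A reader can apply the same check to any reported summary table: if the tabled F
does not equal the ratio of the two mean squares, one of the numbers was mistyped.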

Almost any set of data can be analyzed in more than one way. In deciding
which analysis or analyses to perform and to report, consider the needs of the
various audiences of your evaluation report. Locally, separate analyses for dif-
ferent schools or classrooms may be of interest. This is fine, but audiences out-
side of the school system will be interested in the finding that "Ms. Richards'
class showed improvement but Mr. Thompson's did not," only if the differences in
the two teachers' career education strategies are discussed. It is a fairly com-
mon practice to analyze outcomes of students at a given grade level separately by
school or classroom without identifying in the report any reasons for differing
results. Unless the career education experiences of the various groups are dif-
ferent and identified, a summary analysis also should be presented, where all
"career educated students" are combined into a single group.

j. If significant differences are found and the study involves three or
more groups of students, a test of multiple comparisons

If, for example, the evaluation design includes two groups of students
who have experienced different types or amounts of career education and a
comparison group exposed to little or no career education, the F statistic
alone does not indicate which pair(s) of groups are different, and a mul-
tiple comparison test is needed. The statistic and values of its parameters,
as well as the results, should be indicated.
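The checklist does not name a particular multiple comparison test. Purely as an
illustration of the bookkeeping involved, the sketch below lists the pairs of groups
to be compared and applies a Bonferroni adjustment, which is one common choice and
an assumption of this example rather than a recommendation from the checklist. The
group labels and the function name are hypothetical.

```python
from itertools import combinations


def bonferroni_plan(group_names, alpha=0.05):
    """Return every pair of groups and the per-comparison alpha level."""
    pairs = list(combinations(group_names, 2))
    # Bonferroni adjustment: divide the overall alpha by the number of comparisons.
    adjusted_alpha = alpha / len(pairs)
    return pairs, adjusted_alpha


# Hypothetical labels for the three-group design described above.
pairs, adjusted_alpha = bonferroni_plan(
    ["career education A", "career education B", "comparison"]
)
```

Whatever test is chosen, the report would state its name, the pairs compared, and
the alpha level applied to each comparison.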

8. Interpretation of results

a. The meaning of the statistical results

Since many audiences of your report will be unaware of the principles
of statistical inference it is important to explain the meaning of the
analysis. Two examples are: "Although the career education students scored
on the average somewhat higher than the comparison group, the t test shows
that the difference in scores was probably a chance occurrence. Thus, these
test results give us no reason to believe that participation in the program
influenced the self-esteem of ninth-graders." "The results of the analysis
of covariance show that when we account for chance differences between the
groups' measured reading achievement prior to the program, we can say with
95% confidence that the superior performance of career education students on
the post-test is due to something other than chance. That is, upon comple-
tion of the program, career education students read better than we would ex-
pect had they not participated in the program."

b. Your interpretation of the reason for the results

An ineffective career education program is only one of many possible ex-
planations of disappointing evaluation results and although it should not be
eliminated as a possibility, other factors should be explored. Just a few
of these are comparison groups which are unlike career education groups on
relevant variables, measurement instruments which are inappropriate for the
program or insensitive to instruction, and poorly-controlled testing situa-
tions. Formulating defensible hypotheses concerning which of these or other
factors influenced the results requires familiarity with the full context of
the program and the evaluation, a familiarity most of the report's readers
will not have. Therefore, your speculation is not only acceptable; it is
desirable, so long as speculation is labelled as such.

Similarly, career education experiences rarely constitute the only feasi-
ble explanation of positive results. If circumstances would not permit the
application of a tightly-controlled evaluation design, addressing the short-
comings of the evaluation is more likely to enhance its credibility than to
detract from it. Reinforcing evaluation results with other observations or
research is in order, especially if you have reason to believe that the results
are valid but your evaluation model does not allow conclusive cause-and-effect
inferences. If, for example, your evaluation of pre- to post-growth without
a reference group yields positive results, the question of whether this growth
would have occurred without career education still remains. You may be able
to cite other research indicating that at this age youngsters generally grow
very little or even regress in this area, as is often the case in the affec-
tive domain. Informal observations of teachers, which alone may be inadequate
evidence of program impact, also are effective "back-ups" to objective evalua-
tion results. Particularly if these observations are in the form of case his-
tories they make the report both readable and convincing.

Also, if you have reason to believe a particular career education activity
or strategy was a major influence on positive results, share this hypothesis.



c. The educational significance of results

Statistical significance can be and often is achieved with a difference
between means of one raw score point or less. Thus, you should not assume
that statistically significant results alone demonstrate that the program is
worth maintaining or expanding. Whereas the judgement of the importance of a
given level of student impact to the school system's priorities is more appro-
priately made by school administrators than by evaluators, the evaluation re-
port can discuss the magnitude of the impact in ways which facilitate these
judgements. Although grade equivalent scores and percentile ranks should not
be used for analysis, group means on nationally-normed tests may be converted
to these scores for this purpose. On other tests it may help to convert raw
scores to percentage scores. Another thing to keep in mind is that some
career education objectives may seem trivial to some members of your report's
audience. If you anticipate this problem, it is advisable to explain the rele-
vance of the objective to the overall goals of career education.

Closely associated with the question of educational significance is that of
cost effectiveness. A cost analysis may address any one or combination of the
following questions: 1) What did it cost to produce the student impact identi-
fied by evaluation results? 2) What would it cost to maintain the program?
3) How does the cost of the program's strategies compare to that of other strat-
egies which produce the same effect? 4) Which cost components could be decreased
or eliminated without seriously affecting student impact? It is beyond the scope
of the checklist to delve into this topic, but we would like to stress that cost
effectiveness analysis adds a very desirable dimension to evaluation and is likely
to receive more attention in the future.

This checklist item may be omitted if no statistical significance is achieved.
If results are statistically significant educational significance should be
addressed. Only if the results are judged educationally significant is it appro-
priate to calculate cost effectiveness.

9. Final checks

a. Clearly labelled tables

Be sure that tables indicate what the numbers within the table represent.

b. Accurately typed numerical data

Careful proof-reading of final typewritten copy is essential, as conflicting
data within a report is a common but easily-avoided problem.

Application of the Checklist

Following is an illustrative student impact study report. The report is
entirely fictitious; any resemblance of the program, the setting, the instruments,
or the results to anything existing or planned is purely coincidental. The student
impact report should be considered just one section of a larger evaluation report
which in turn is one section of a project's performance report.

Marginal notations indicate the checklist item or items addressed in each sec-
tion of the report. The report discusses three different "studies" as we have de-
fined them and checklist items are subscripted for sections which pertain to only
one or two of these studies.




The Fourth-Grade Model Program:
Its Impact on Students

The student impact assessment at the fourth-grade level focused on three
evaluation questions associated with the major elementary objectives of the
Career Orientation Project (COP):

1. Does career education affect students' reading achievement?

2. Does career education affect students' mathematical achievement?

3. Does career education positively affect students' career awareness?

Career Education and Comparison Groups

Lincoln Elementary School, judged by the COP staff the pilot school with the
most advanced elementary career education program, served as the experimental
site. The administration of Washington Elementary School declined the
invitation to participate in the project but agreed in December to arrange
for the school's fourth graders to serve as a comparison group. The two
schools serve adjacent attendance zones of a large suburban community of
primarily white-collar workers, and both follow the academic curriculum
endorsed by the district. Both schools have a self-contained classroom
organization where students are assigned randomly to classroom units.
However, the enrollment of Lincoln is about twice that of Washington, with
four classroom units at grade four as compared to two fourth-grade classes at
Washington.

Since one of the four fourth-grade teachers at Lincoln attended only one COP
workshop and considers career education a waste of time, his students were
not included in the evaluation. The other three teachers have been active in
career education since attending COP's orientation program last spring,
having participated in the thirty-hour summer workshop and in monthly
meetings with the COP elementary consultant.

Each teacher made use of the COP-developed curriculum guide, but the emphasis
of various activities varied from class to class, as shown in Table A. The
numbers of field trips, guest speakers, and audio-visual activities were
determined from COP's resource center records for the period of September 7
through May 13; all students present on the days of these activities
participated in them. The number of students who shadowed a parent at work
sometime during the year was provided by teachers, who determined the number
on the basis of students' oral reports.

Two teachers maintained a daily record of career education infusion into
content area lessons with a checklist instrument formatted as a calendar.
Infusion was defined as an activity which simultaneously addressed an
identifiable content-area objective and an identifiable career education
objective; thus, the infusion data partially duplicate the data shown for
other strategies. For example, a film about the application of fractions to
jobs in the construction industry would be considered a math infusion
activity as well as an occupational information audio-visual activity.



Teacher A subscribes to the infusion strategy and, according to the COP
elementary consultant, also applies it extensively, but she found the
checklist system burdensome and never submitted monthly reports. Since the
evaluators were not contracted until December, the checklist system did not
go into effect until January. The COP staff pointed out that these data for
classes B and C may be an overestimate of the extent of infusion throughout
the school year because the teachers seemed to become more active in career
education as the year progressed.

The teachers of the comparison group classes were interviewed by a member of
the evaluation team during May in order to determine whether their students
had received career education instruction. Both teachers were aware of the
COP program and had heard of career education, but neither had participated
in any formal training in career education. Both classes had experienced
three field trips, but clarifying questions indicated that they were
traditional "product-oriented" field trips rather than "career education"
field trips. One of the teachers is interested in values clarification and
estimated that her class had experienced about one valuing activity per week.
Aside from these cases, the comparison students' teachers reported no use of
the activities listed in Table A or other conscious efforts toward career
orientation in their instruction.






                                TABLE A
                      Career Education Activities
                         of Experimental Group

Strategy                                       A        B        C

Class size (in May)                           24       19       23
Values/self awareness AV activities           13        6        7
Occupational information AV activities         3       12       27
Field trips                                    4        4   [illegible]
Guest speakers                                18       13        0
Parent shadows (% students)                  78%      61%      95%
Infusion (% daily lessons)
   Language arts                               ?      24%      17%
   Social studies                              ?      33%      40%
   Math                                        ?      21%      23%
   Others                                      ?       8%      13%



Academic Achievement

Conveniently, the district's testing program involved the administration of
the McDonald Achievement Battery, Form GA,¹ to all fourth graders in
mid-September. The reading comprehension and math concepts scales of this
battery, with reported test-retest reliabilities of .82 and .87
respectively, were chosen for analysis.

Two-tailed t-tests for independent groups were applied to the pre-test scores
of the students enrolled in career education and comparison group classes in
September. No significant differences were found between the groups on either
scale (reading: t(103) = -.68, p > .1; math: t(105) = .19, p > .1). Thus, the
analysis of covariance model, which accounts for random pre-test differences,
was chosen for determining whether the two groups differed on end-of-year
achievement.

The reading comprehension and math concepts scales of the McDonald Form GA
were readministered to both groups by their teachers in mid-May. If a
student's pre- or post-score on a given scale was missing, he or she was
eliminated from the analyses of that scale. Sixty-four career education and
forty-six comparison students remained in their respective classes throughout
the year; thus, the large majority of eligible students were included in the
analyses. Presented in Table B are mean scores, standard deviations, and
results of the one-way analysis of covariance, where pre-test scores served
as the covariate and post-test scores as the dependent variable.

¹ McDonald Test Company, Box 307, Pittsville, Kentucky 44444



                                TABLE B
                 McDonald Achievement Battery Results
                              Raw Scores

                        Mean (Std. Dev.)            Analysis of Covariance
Scale/Group      N     Pre      Post    Adjusted    SV        df    MS      F

READING COMP.
  Career Ed     60    22.64    30.35     30.63      Between    1   18.23   1.67
                     (6.21)   (7.08)                Within    98   10.91   p > .1
  Comparison    41    23.41    29.67     29.52      Total     99
                     (5.52)   (5.75)

MATH CONCEPTS
  Career Ed     59    26.11    33.36     33.24      Between    1   52.71   4.23
                     (7.42)   (7.63)                Within    99   12.46   p < .05
  Comparison    43    25.84    29.76     29.58      Total    100
                     (8.32)   (8.50)
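As a quick arithmetic check on Table B, each F ratio should equal the between-groups mean square divided by the within-groups mean square. The sketch below recomputes both ratios from the tabled mean squares; the dictionary layout and function name are illustrative assumptions, not part of the report.

```python
# Mean squares and within-groups degrees of freedom copied from Table B.
TABLE_B = {
    "reading": {"ms_between": 18.23, "ms_within": 10.91, "df_within": 98},
    "math": {"ms_between": 52.71, "ms_within": 12.46, "df_within": 99},
}


def f_ratio(ms_between, ms_within):
    """F statistic for a one-way design: MS(between) / MS(within)."""
    return ms_between / ms_within


for scale, row in TABLE_B.items():
    f = f_ratio(row["ms_between"], row["ms_within"])
    print(f"{scale}: F(1, {row['df_within']}) = {f:.2f}")
```

Rounded to two decimals, the recomputed ratios agree with the tabled F values of 1.67 and 4.23.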



The mean post-test reading score of the career education group was higher
than the comparison group's, but even when we take into consideration the
career education group's slightly lower pre-test scores, these end-of-year
differences are not statistically significant and are probably due to chance.
However, the evaluators learned after becoming involved with COP that the
district has a state-wide reputation for its excellent elementary reading
program. Achievement scores confirm this reputation; the grade equivalents of
the combined groups' means are 4.3 for the September testing and 5.5 in May.
Thus, it may have been unreasonable to expect the career education program to
improve measurably the already high-quality reading instruction of these
youngsters. It also should be noted that although no positive effects of the
inclusion of career education are evident in reading scores, neither are any
negative effects.






The career education group outperformed the comparison group on the math
concepts scale to a degree which the evaluators consider educationally, as
well as statistically, significant. The difference between the groups'
adjusted post-test means slightly exceeds one-third of the test's norm
group's standard deviation (3.66/10.31 = .35), a common standard for judging
educational significance.² Put another way, the grade equivalent of the
career education group's year-end mean score was 5.3 as compared to the
comparison group's 4.9.
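The one-third-of-a-standard-deviation comparison can be reproduced in a few lines. This is a sketch only: the variable and function names are assumptions, and the career education group's adjusted mean is taken as 33.24, that is, the comparison group's 29.58 plus the 3.66 difference quoted in the text.

```python
ADJ_CAREER_ED = 33.24   # adjusted post-test mean, career education (29.58 + 3.66)
ADJ_COMPARISON = 29.58  # adjusted post-test mean, comparison group
NORM_SD = 10.31         # norm group's standard deviation quoted in the text
THRESHOLD = 1 / 3       # the "one-third of a standard deviation" standard


def standardized_difference(mean_a, mean_b, norm_sd):
    """Difference between two means in norm-group standard deviation units."""
    return (mean_a - mean_b) / norm_sd


effect = standardized_difference(ADJ_CAREER_ED, ADJ_COMPARISON, NORM_SD)
meets_standard = effect > THRESHOLD
print(f"standardized difference = {effect:.2f}, exceeds 1/3: {meets_standard}")
```

The margin over the threshold is small, which is why the text says the difference "slightly exceeds" the standard.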

There are at least three ways in which the career education program may have
resulted in improved math achievement:

1. Because teachers learned more interesting ways to teach math by infusing
   career education into their lessons, they gave more emphasis to this
   subject area.

2. By recognizing the relevance of math to their present and future lives,
   students became more motivated to learn these skills.

3. Through increased "hands-on" experiences in applying math concepts to
   realistic problems, students were more able to internalize mathematical
   skills.

Since the infusion of career education into mathematics was fairly intense
and the elementary curriculum guide emphasizes manipulative activities and
discussions of "real-world" applications of mathematics, both 2 and 3 above
were probably operating. Also, COP's Elementary Consultant reports that many
elementary teachers have found themselves devoting more time to math
instruction since becoming active in career education, although he does not
recall specifically whether this comment was made by Lincoln's fourth-grade
teachers.

Career Awareness

The program's success in developing students' career awareness was assessed
by comparing the career education and comparison groups' year-end performance
on the locally developed Career Quiz. Although no pre-test data were
collected with respect to this outcome, the career education and comparison
groups' equivalence on socio-economic factors and on September academic
achievement make it reasonable to assume that any observed differences in
post-test scores are probably due to the career education program.

² A Practical Guide to Measuring Project Impact on Student Achievement by
D. Horst, K. Tallmadge, and C. Wood of RMC Research Corporation, 1975,
ERIC number ED 106376, p. 69.

The Career Quiz is a test of occupational knowledge developed jointly by
COP's Elementary Consultant and the evaluation team's Instrumentation
Specialist. It consists of 45 matching and multiple-choice items designed to
measure knowledge of the working conditions of a variety of occupations and
the relationship of school subjects and avocational interests to various
occupations. The test booklet and scoring key are attached. Job titles
appearing in the test were selected to represent all 15 U.S.O.E. occupational
clusters and all levels of educational preparation. Job titles were verified
in the Dictionary of Occupational Titles, and a variety of materials in COP's
resource center were used to verify the accuracy of the scoring key.

A portion of a COP curriculum development workshop was devoted to the review
of the Career Quiz. The 12 teachers on the team were asked to evaluate each
test item by answering "yes" or "no" to the following questions:

1. Does the item reflect the intent of the district's career education
   efforts at the elementary level?

2. Is it free of sex-stereotyping?

3. Is it free of ambiguity?

4. Are the format and reading level acceptable for at least ninety percent
   of fourth graders? (Only elementary teachers were asked this.)

An additional ten career educators in the state performed the same review by
mail.

Any item which received more than two "no's" for any one question or more
than five "no's" for all questions combined was re-written or eliminated.
Fifteen original items were eliminated on the basis of question 1 and 8 were
re-written on the basis of questions 2, 3, and 4. Test results are shown
below as raw scores. Only students who spent the entire year in the same
classroom were included in the analysis.
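The item-screening rule described above (more than two "no's" on any one question, or more than five "no's" across all questions) can be sketched as a small function. The function name, the argument layout, and the sample vote counts are hypothetical and serve only to illustrate the rule.

```python
def needs_revision(no_votes, per_question_limit=2, total_limit=5):
    """Flag an item for re-writing or elimination.

    no_votes maps a review question number (1-4) to the count of "no"
    answers the item received on that question. Hypothetical helper.
    """
    too_many_on_one = any(n > per_question_limit for n in no_votes.values())
    too_many_overall = sum(no_votes.values()) > total_limit
    return too_many_on_one or too_many_overall


# Hypothetical vote counts, invented for illustration only.
print(needs_revision({1: 0, 2: 1, 3: 2, 4: 1}))  # within both limits
print(needs_revision({1: 3, 2: 0, 3: 0, 4: 0}))  # three "no's" on question 1
print(needs_revision({1: 2, 2: 2, 3: 2, 4: 0}))  # six "no's" in total
```

Either condition alone is enough to flag an item, so an item can fail on the combined count even when no single question exceeds its limit.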

                                TABLE C
                          Career Quiz Results

Group                  N      Std. Dev.     Mean

Career Education      60        ?.39        ?2.2?
Comparison            42        5.?8        12.?6



A one-tailed t-test for independent groups confirmed the significance of the
rather dramatic superiority of the career education group's performance:
t(100) = 18.93, p < .001. Whereas the comparison group's mean of 33% correct
is only slightly above the test's "guess rate", the career education group's
average score was 85% correct. Since the importance of fourth graders'
possession of occupational knowledge may not be immediately obvious, the
reader is encouraged to review the discussion of the COP career development
model presented in the performance report.

These evaluation results demonstrate that where the COP career education
model was applied, it resulted in improved math achievement and career
awareness and could be expected to have the same impact on other groups of
similar students. Furthermore, the results do not preclude the possibility
that the COP model may affect reading achievement in a setting where the
reading program is in need of improvement.

Even making the unlikely assumption that the only student benefits of
Lincoln's career education program were those addressed in the evaluation,
we find that the documented benefits were achieved at a relatively low
program maintenance cost of about $5.79 per student. Included in this
calculation are the costs of four field trips for 66 children at $75.00 each
and $1.23 per student provided by COP to involved teachers for purchasing
expendable materials.
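The per-student figure can be approximated from the quantities given. The sketch below assumes that "$75.00 each" means $75.00 per field trip (not per child) and that the trip costs are spread evenly over all 66 students; the names are illustrative.

```python
FIELD_TRIPS = 4
COST_PER_TRIP = 75.00          # assumption: $75.00 is the cost of one trip
STUDENTS = 66
MATERIALS_PER_STUDENT = 1.23   # expendable materials allowance from COP


def maintenance_cost_per_student(trips, cost_per_trip, students, materials):
    """Field-trip cost shared across students, plus the materials allowance."""
    return trips * cost_per_trip / students + materials


cost = maintenance_cost_per_student(
    FIELD_TRIPS, COST_PER_TRIP, STUDENTS, MATERIALS_PER_STUDENT
)
print(f"maintenance cost per student: ${cost:.2f}")
```

Under these assumptions the result lands within a cent or so of the "about $5.79" quoted in the text; the small gap presumably reflects rounding in the original calculation.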

Not included in the maintenance cost are initial curriculum development,
inservice training, and the purchase of durable materials. The maintenance of
established programs and the introduction of the program into more schools
will require continued inservice training and update of the curriculum and
the materials center. This can be accomplished for the district's 12
elementary schools at an annual cost of $23,000.00 to cover both operating
expenses and the salaries of a curriculum specialist and a quarter-time
secretary.



